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Abstract 


In  this  research  we  focused  on  intertwined  low-level  motion  planning  and  high-level  task  assignment 
for  a  cooperating  team  of  heterogeneous  unmanned  assets.  As  a  case  study  we  are  envisioning  the 
scenario  where  a  team  of  unmanned  aerial  vehicles  (UAVs)  has  to  query  a  set  of  unattended  ground 
sensors  (UGSs),  positioned  along  a  road  network.  In  such  a  scenario  the  UAV  team  cooperates  in 
searching  and  prosecuting  ground  target  intruders,  identified  by  the  ground  sensors.  Of  particular 
interest  to  us  was  the  cooperation  and  coordination  between  the  unmanned  aerial  vehicles  in 
devising  their  assignments  and  trajectories. 

The  research  was  conducted  in  two  phases.  In  the  first  phase  we  have  concentrated  on  the 
motion  planning  and  task  assignment  aspects  of  the  problem  where  a  single  UAV  has  to  decide  on 
the  query  ordering  of  the  UGSs,  given  constraints  on  its  motion.  This  problem  was  formalized  as 
the  Dubins  travelling  salesman  problem  (TSP).  The  contributions  of  the  study  in  the  first  phase 
can  be  summarized  as  follows: 

•  Two  new  algorithms  for  the  Dubins  TSP  were  developed  that  are  applicable  to  both  ordered 
and  unordered  sets  of  targets.  Instead  of  being  decoupled,  the  combinatorial  and  motion 
planning  aspects  of  the  Dubins  TSP  were  treated  in  an  integrated  manner  and  no  assumptions 
were  made  on  the  magnitude  of  the  intercity  distances.  The  two  algorithms  complement  each 
other  in  terms  of  their  range  of  applicability  with  regard  to  the  size  of  the  problem. 

•  The  first  algorithm  —  clubbed  “A;-step  look-ahead  algorithm”  —  stems  from  the  formulation 
of  the  Dubins  TSP  as  a  minimum-time  control  problem  and  is  suitable  for  obtaining  short 
tours  when  the  number  of  cities  is  relatively  small. 

•  The  second  algorithm  is  an  adaptation  of  the  classic  2-Opt  algorithm  for  the  TSP  and  can 
be  applied  to  large  instances  of  the  Dubins  TSP. 

•  Analytical  bounds  were  derived  for  the  lengths  of  the  tours  and  for  the  complexity  of  the 
problem.  Monte  Carlo  simulations  were  used  to  investigate  and  compare  between  the  per¬ 
formance  of  the  proposed  algorithms  and  existing  algorithms  from  the  literature. 

In  the  second  phase  of  the  research  we  have  paid  special  attention  to  the  coupled  motion 
planning  and  task  assignment  problem  for  groups  of  UAVs  and  UGSs.  The  specific  problem 
considered  includes  allocating  the  group  of  UAVs  to  a  given  set  of  UGSs  while  accounting  for 
the  vehicles  kinematic  constraints  and  avoiding  collision  with  obstacles  scattered  in  the  vehicles’ 
environment.  The  contributions  of  the  study  in  the  second  phase  can  be  summarized  as  follows: 

•  To  account  for  the  effect  of  a  diminishing  return  with  time  for  information  collected  at 
each  UGS,  each  UGS  is  assigned  a  specific  time  dependent  benefit.  The  benefit  represents 
the  UGS  importance  and  priority,  possibly  assigned  by  a  human  operator.  The  time  varying 
priority  issue  is  addressed  by  incorporating  the  vehicles’  feasible  path  length,  which  represents 
the  vehicles’  response  time,  and  the  targets’  properties  as  an  inherent  part  of  the  problem 
formulation. 

•  The  integrated  problem  of  task  assignment  and  motion  planning  was  posed  in  the  form  of  a 
decision  tree  and  two  solution  algorithms  were  developed.  An  exhaustive  search  algorithm 
which  improves  over  run  time  and  provides  the  best  solution  encoded  in  the  decision  tree, 
and  a  greedy  algorithm  that  provides  a  quick  feasible  solution  (also  used  as  an  upper  bound). 
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•  Although  the  greedy  algorithm  provides  a  sub-optimal  solution,  it  is  useful  in  large  scale  real 
time  scenarios,  where  computational  running  time  is  important.  The  exhaustive  algorithm 
can  provide  an  immediate  solution  that  improves  over  run  time  for  large  scale  scenarios,  or 
it  can  be  used  in  off-line  scenarios. 

•  Using  simulations  the  performance  of  the  algorithms  was  compared  and  evaluated,  and  the 
influence  of  the  time  varying  targets’  priority  on  the  task  allocation  process  was  demonstrated 
and  investigated. 

The  results  of  the  research  have  appeared  in  2  conference  publications,  1  submitted  journal  pub¬ 
lications,  and  1  journal  paper  that  is  currently  in  preparation  stages. 

The  published  conference  papers  are: 

•  Pantelis,  I.  and  Shima,  T.,  “A  Task  and  Motion  Planning  Algorithm  for  the  Dubins  Travelling 
Salesperson  Problem”,  Proceedings  of  the  19th  1FAC  World  Congress,  Cape  Town,  South 
Africa,  August  2014. 

•  Gottlieb,  Y.  and  Shima,  T.,  “Task  Assignment  and  Motion  Planning  in  the  Presence  of 
Obstacles  and  Prioritized  Targets” ,  Proceedings  of  the  54th  Israel  Annual  Aerospace  Conference, 
February  2014. 

The  following  paper  has  been  submitted  to  Automatica.  It  is  currently  in  its  second  round  of 
review: 

•  Pantelis,  I.  and  Shima,  T.,  “Motion  Planning  Algorithms  for  the  Dubins  Travelling  Salesperson 
Problem”,  2014. 

The  following  paper  is  in  preparation  to  be  submitted  to  the  journal  Sensors  as  an  invited 
feature  paper: 

•  Gottlieb,  Y.  and  Shima,  T.,  “UAVs  Task  and  Motion  Planning  in  the  Presence  of  Obstacles 
and  Prioritized  Targets”,  2014. 
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Introduction 


This  research  is  concerned  with  intertwined  low-level  motion  planning  and  high-level  task  assign¬ 
ment  for  unmanned  aerial  vehicles  (UAVs).  As  a  case  study,  the  scenario  where  a  team  of  UAVs 
has  to  query  a  set  of  unattended  ground  sensors  (UGSs)  is  envisioned.  The  problem  can  be  ex¬ 
plained  using  the  following  example:  A  network  of  Unattended  Ground  Sensors  (UGS)  and  a  team 
of  unmanned  vehicles  are  used  to  prevent  intruders’  access  to  a  restricted  zone  (base  defense)  [45]. 
The  UGS  network  is  deployed  at  critical  road  junctions,  when  a  sensor  is  triggered  by  an  intruder, 
the  location  is  sent  as  a  target  to  be  visited  by  the  team  of  unmanned  vehicles.  In  case  there 
are  multiple  intrusions  at  different  times,  the  group  of  vehicles  must  be  allocated  according  to  the 
vehicles’  response  time  (time  to  target)  and  the  target’s  priority,  which  can  be  based  on  the  order 
of  the  UGS  triggering  time  or  on  the  location  of  the  sensor.  The  target  (sensor)  priority  is  time 
dependent  since  as  time  passes,  the  intruder  may  advance  to  a  different  location  and  the  target 
relevance  decreases. 


1.1  Single  Vehicle  -  Multiple  Targets 

In  the  first  phase  of  our  research  we  have  concentrated  on  the  motion  planning  and  task  assignment 
aspects  of  the  problem  where  a  single  UAV  has  to  decide  on  the  query  ordering  of  the  multiple 
UGSs,  given  constraints  on  its  motion.  This  problem  was  formalized  as  the  Dubins  travelling 
salesman  problem  (TSP). 

The  Dubins  Travelling  Salesperson  Problem  (DTSP)  and  its  variants  [8,  12,  18,  21,  22,  24,  25] 
are  useful  abstractions  for  the  study  of  problems  related  to  motion  planning  and  task  assignment 
for  uninhabited  vehicles.  As  in  the  case  of  the  classic  Euclidean  Travelling  Salesperson  Problem 
(ETSP)  in  M2  [19],  the  sought  after  solution  to  the  DTSP  is  a  tour  of  minimum  length  that 
passes  through  every  city  (target),  however,  in  the  case  of  the  DTSP,  the  tour  is  required  to  be  a 
planar  C 1  regular  curve  whose  curvature  is  bounded  above  by  a  given  constant.  The  additional 
requirements  on  the  regularity  and  the  curvature  of  the  tour  have  a  fundamental  implication  on 
the  very  nature  of  the  problem.  Specifically,  the  ETSP  belongs  to  the  realm  of  combinatorial 
optimisation,  whereas  the  DTSP  does  not.  A  precise  formulation  of  the  DTSP  is  given  below, 
however  the  crux  of  the  matter  is  that,  in  the  case  of  the  DTSP,  even  if  the  order  of  the  targets 
is  given  and  fixed,  the  length  of  the  tour  depends  on  the  heading  of  the  Dubins  vehicle  when  it 
passes  through  each  target  (in  other  words,  the  slope  of  the  tour  at  each  target).  Therefore,  the 
solution  space  for  the  DTSP  has  the  cardinality  of  the  continuum.  1A  possible  remedy  is  to  directly 
discretise  the  problem;  that  is,  for  each  target,  to  consider  h  candidate  headings  that  partition 
the  interval  [0,  27t)  uniformly  into  h  subintervals.  The  resulting  discrete  DTSP  (DDTSP)  can  be 
cast  as  an  integer  linear  program  (ILP)  that  can  be  solved  using  existing,  optimised  solvers  and  is 

1  Although  the  solution  space  consists  of  closed,  planar  curves ,  to  each  n-tuple  (0i,  ■ . . ,  9n)  £  S"  of  headings, 

where  n  is  the  number  of  cities,  corresponds  a  finite  number  of  tours  of  the  same  minimum  length. 
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amenable  to  analytic  tools  from  the  theory  of  ILP.  The  discretisation  approach  has  been  employed 
in  [7]  to  prove  tight  upper  bounds  for  the  optimal  solution  (to  the  DTSP)  that  depend  only  on  the 
number  of  targets  and  the  norm  of  the  partition  of  [0,  2ti).  The  bounds  are  tight  in  the  sense  that 
they  can  be  achieved  by  instances  of  the  DTSP  chosen  by  an  adversary.  On  the  other  hand,  the 
DDTSP  is  equivalent  to  a  generalised  asymmetric  travelling  salesperson  problem  (GATSP)  [13] 
with  n  clusters  and  1  +  h(n  —  1)  cities2  and  it  is,  therefore,  of  interest  to  keep  the  value  of  h  small. 
However,  the  minimum-time  function  for  the  Dubins  dynamics  is  discontinuous  (in  other  words, 
an  infinitesimal  change  of  the  value  of  a  heading  can  cause  the  overall  length  of  the  tour  to  jump  to 
a  higher  value)  and,  hence,  a  large  value  of  h  is  necessary  if  an  optimal  solution  to  the  DTSP  is  to 
be  approximated.  One  way  to  address  this  issue  is  to  observe  that  the  DTSP  lends  itself  naturally 
to  the  application  of  a  receding  horizon  principle.  Such  an  approach  is  taken  in  [17].  The  fc-step 
look-ahead  (fc-step  LAA)  algorithm  in  Section  2.3  also  relies  on  a  receding  horizon  principle;  the 
essential  difference  between  the  fc-step  LAA  and  the  algorithm  in  [17]  is  that,  in  [17],  the  order  of 
the  targets  is  considered  given  and  fixed,  whereas  in  the  present  paper  it  is  one  of  the  unknowns 
of  the  problem.  With  regard  to  the  DDTSP,  it  is  shown  in  Sections  2.3  and  2.5  that  the  fc-step 
LAA  can  reduce  the  search  for  a  feasible  DTSP  tour  to  a  combinatorial  problem  that  is  smaller 
in  size  than  the  DDTSP3  (depending  on  the  choices  of  h  and  k ),  while  achieving  a  substantial 
improvement  (by  50%,  in  certain  cases)  over  algorithms  that  rely  on  a  solution  to  the  ETSP  to 
find  a  feasible  tour  for  the  DTSP  [22], 

To  address  the  problem  of  solving  instances  of  the  DTSP  that  are  too  large  for  the  k- step 
LAA  to  be  practical,  a  local  improvement  algorithm  is  presented  in  Section  2.4.  The  idea  behind 
the  algorithm  is  borrowed  from  the  literature  on  the  ETSP  where  it  is  known  as  the  2-Opt  algo¬ 
rithm  [15].  Hence,  we  call  the  algorithm  of  Section  2.4  the  “2-Opt  fc-step  look-ahead  algorithm” 
(2-Opt  fc-step  LAA)  and  establish  an  upper  bound  on  the  length  of  DTSP  tours  it  generates,  under 
a  non-restrictive  assumption  on  the  initial  tour.  When  the  order  of  the  targets  is  fixed  and  the 
k- step  LAA  is  only  applied  to  find  an  admissible  DTSP  tour,  we  use  the  name  fc-step  ETSP-LAA 
to  distinguish  such  cases  from  those  where  the  order  of  the  targets  is  also  unknown. 

Once  an  assignment  of  a  heading  to  each  target  has  been  made,  several  existing  algorithms — 
e.g.,  approximation  algorithms  for  the  Asymmetric  Travelling  Salesperson  Problem  (ATSP) — 
can  be  directly  applied  to  find  a  feasible  tour  for  the  DTSP.  For  the  purpose  of  illustration  and 
comparison,  the  fc-step  LAA  is  combined  with  Dijkstra’s  algorithm  to  generate  the  simulations  in 
Section  2.5.  Given  the  sheer  size  of  the  combinatorial  problem  to  be  solved  and  that  Dijkstra’s 
algorithm  is  a  greedy  algorithm  that  returns  an  optimal  solution  (a  shortest  path  on  a  graph),  this 
combination  limits  the  applicability  of  the  overall  algorithm  to  cases  with  a  small  number  of  cities 
(these  statements  are  quantified  in  later  sections)  and  leaves  open  for  further  investigation  the 
fusion  of  the  fc-step  LAA  with  other  algorithms.  Nevertheless,  there  are  application  areas  where  a 
small  number  of  targets  does  not  represent  an  unrealistic  scenario. 

1.2  Multiple  Vehicles  -  Multiple  Targets 

In  the  second  phase  of  the  research  we  have  paid  special  attention  to  the  coupled  motion  planning 
and  task  assignment  problem  for  groups  of  UAVs  and  UGSs.  The  specific  problem  considered 
includes  allocating  the  group  of  UAVs  to  a  given  set  of  UGSs  while  accounting  for  the  vehicles 

2 The  city  of  origin  has  its  heading  fixed  to  the  initial  heading  of  the  vehicle.  To  every  other  city  corresponds  a 
cluster  with  h  nodes — as  many  as  the  candidate  headings. 

3  At  the  cost  of  computing  the  heading  at  each  city,  rather  than  assigning  it  arbitrarily. 


kinematic  constraints  and  avoiding  collision  with  obstacles  scattered  in  the  vehicles’  environment. 

The  task  assignment  problem  is  usually  coupled  with  the  motion  planning  problem  as  the  as¬ 
signments  allocation  process  depends  on  the  path  length,  and  the  path  length  depends  on  the 
vehicle’s  assignments.  This  coupling  issue  is  addressed  in  the  unmanned  vehicles  cooperative  mul¬ 
tiple  task  assignment  problem  (CMTAP)[26].  The  CMTAP  includes  a  scenario  in  which  multiple 
unmanned  vehicles  perform  multiple  tasks  on  stationary  targets.  Different  approaches  based  on 
customized  combinatorial  optimization  methods  were  employed  to  solve  this  problem,  including 
the  mixed  integer  linear  programming  (MILP)  [27,  28],  the  capacitated  transhipment  network 
solver  [29,  30],  genetic  algorithms  [31,  32,  26]  and  tree  search  methods  [33,  34],  The  presented 
works  account  for  the  vehicles’  constraints  but  they  simplify  the  problem  by  assuming  that  the 
environment  is  obstacle  free.  Most  of  the  studies  which  take  into  account  obstacles,  address  only 
the  motion  planning  subproblem  between  an  initial  and  final  configuration.  They  include  methods 
such  as  the  RRTs  method  [35],  probabilistic  roadmaps  [36]  and  kinodynamic  method  [37]. 

One  of  the  main  properties  of  the  problem  stated  above  is  the  assumption  that  the  targets 
have  the  same  characteristics  and  differ  only  in  their  position.  In  many  scenarios,  each  target  in 
the  targets  set  has  unique  attributes  which  include  different  importance  and  priority.  The  targets’ 
priority  may  also  vary  in  time  depending  on  the  specific  scenario.  Cases  in  which  targets  are 
assigned  with  a  priority  value  were  studied  in  [42,  43,  44],  The  problems  presented  in  these  studies 
include  a  task  allocation  process  that  maximizes  the  service  provided  to  each  target  based  on  the 
target’s  parameters  and  the  vehicles’  capabilities  (fuel,  payload).  The  targets  priority  is  addressed 
by  using  an  objective  function  which  includes  a  constant  parameter  describing  the  priority  value. 
In  these  problems  the  vehicles’  constraints  are  not  taken  into  account  and  the  environment  is 
assumed  to  be  free  of  obstacles,  which  may  lead  to  infeasible  vehicles’  trajectories. 
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Single  UAV  scenarios 


In  the  first  phase  of  our  research  we  have  investigated  the  problem  where  a  single  UAV,  modeled 
as  a  Dubins  vehicle,  has  to  decide  on  the  query  ordering  of  the  multiple  UGSs.  This  problem  is 
formalized  next  as  the  Dubins  travelling  salesman  problem  (TSP). 

2.1  Problem  Formulation 

Consider  the  control-affine  system 

Y(t)  =  *(7(t  ))  +  u(t)r(7(t))  (£) 

on  M  =  R2  x  S1,  where  X,  Y  G  T^TM  are  the  real  analytic  vector  fields  with  coordinate  repre¬ 
sentations1 

X (x,  y,  9)  =  (cos  9 ,  sin  9,  0)  and  Y (x,  y,  9)  =  (0,  0, 1) . 

The  admissible  controls  are  the  locally  integrable  maps  «:lD  /  U4  u(t)  G  [—1/p,  1/p]  C  R, 
where  p  is  a  fixed,  positive,  real  number,  referred  to  as  the  minimum  turn  radius.  Given  a 
control  u,  the  corresponding  locally  absolutely  continuous  trajectory  of  (£)  is  denoted  by  7.  The 
projection  on  R2  of  a  trajectory  7  will  be  called  the  path  that  corresponds  to  7.  An  optimal 
path  is  the  projection  of  an  optimal  trajectory.  A  trajectory  7  :  R  D  [a,  b]  — y  M  of  (£)  is  said 
to  be  closed  if  7 (a)  =  7 (b)  and  the  path  that  corresponds  to  a  closed  trajectory  is  called  a  tour. 
If  /  is  a  subinterval  of  R,  the  space  of  locally  absolutely  continuous  curves  in  M  defined  on  /  is 
denoted  by  Hj];/  (/;  M). 

The  control  system  (£)  can  be  viewed  as  the  kinematic  model  of  a  point  that  moves  with 
constant,  unit  speed,  along  a  planar  curve  whose  curvature  is  bounded  above  by  1/p.  We  shall 
abide  by  the  common  convention  of  referring  to  (£)  as  the  “Dubins  vehicle”  and  to  the  following 
minimum-time  control  problem  as  the  Dubins  Travelling  Salesperson  Problem  (DTSP). 

DTSP:  Let  n  be  a  positive  integer.  Given  a  point  p  G  M  and  n  submanifolds  ( targets )  of 
the  form  Nt  =  {(xj,jp)}  x  S1  C  M,  where  {x^yf)  G  R2  and  i  G  {l,...,n},  minimise  the  time 
T  >  0  over  the  set  T^p,  Aq , . . . ,  Nn)  of  closed  trajectories  7  G  IV/;/  ([(),  T];  M)  of  (£)  that  satisfy 
7(0)  =  7 (T)  =  p  and  Ini7  C\  Nt  0;  for  every  i  G  {1, . . . ,  n}. 

In  view  of  the  interpretation  of  (£)  as  a  kinematic  model  for  planar  motion,  the  targets  Nt, 
above,  correspond  to  fixed  points  in  R2.  Hence,  to  lessen  the  notational  burden  in  the  remainder  of 
the  paper,  we  refer  to  targets  simply  as  points  (xi,yf)  in  the  plane  and  retain  the  precise  notation 
{(xi,yi)}  x  S1  whenever  mathematical  consistency  is  warranted.  In  the  same  vein,  the  DTSP  can 

1In  the  chart  on  TM  induced  by  the  chart  (U,  f)  =  (R2  x  S1  \  {(—1,  0)},  {x,  y,  v,  w)  i-X  (a;,  y,6  =  a,tan{w/v)))  on 

M.  To  cover  the  entire  state  space  M,  a  second  chart  can  be  chosen  in  an  obvious  manner. 


10 


be  stated  less  formally  as  follows.  Given  an  initial  condition  p  and  n  points  in  the  plane,  find 
the  shortest  path  for  a  Dnbins  vehicle  that  starts  from  p,  passes  through  every  given  point,  and 
returns  to  p.  In  our  formulation  of  the  DTSP,  not  only  the  initial  position  of  the  Dubins  vehicle  is 
assumed  given,  but  also  the  initial  orientation.  It  is,  of  course,  possible  to  also  consider  the  initial 
orientation  of  the  Dubins  vehicle  as  an  independent  variable  of  the  optimisation  problem.  In  [22], 
for  example,  the  initial  orientation  is  chosen  freely.  This  minor  discrepancy  leads  to  minor  only 
modifications  for  any  given  algorithm  to  be  applicable  to  either  one  formulation  of  the  DTSP.  To 
avoid  trivial  complications  with  the  statements  and  the  notation  that  follow,  we  always  assume 
that  the  targets  are  distinct  from  each  other  and  the  initial  condition  p  does  not  lie  in  any  target. 
Lastly,  a  tour  corresponding  to  a  trajectory  7  £  T^(p,Ni, . . .  ,Nn)  will  be  called  an  admissible 
tour. 


2.2  Dubins  paths 

An  essential  component  of  the  algorithms  presented  in  later  sections  is  the  computation  of  minimum¬ 
time  paths  for  the  Dubins  vehicle.  We,  therefore,  recall  the  classification  by  Dubins  [11]  of  such 
paths. 

Let  denote  a  circular  arc  of  <f  radians  and  of  radius  p  in  the  plane,  and  Sd  a  straight-line 
segment  of  length  d  in  the  plane.  A  C1  concatenation  of  such  arcs  and  straight-line  segments  is 
denoted  by  juxtaposition  of  the  corresponding  symbols.  For  example,  CaSdCp  denotes  a  C 1  curve 
that  consists  of  an  arc  of  a  radians,  followed  by  a  straight-line  segment  of  length  d,  followed  by  an 
arc  of  j3  radians.  Consider,  now,  the  Dubins  Problem  (DP)  which  is  the  following  minimum-time 
problem. 

DP:  Given  two  points  p,q  £  M,  minimise  the  time  T  >  0  over  the  set  of  trajectories  7  £ 
W^([0,  T\;  M )  of  (£)  such  that  7(0)  =  p  and  7 (T)  =  q. 

The  following  theorem  is  proven  in  [5,  11,  23]. 

Theorem  1.  A  solution  to  the  Dubins  Problem  exists  and  an  optimal  path  has  to  be  either  of  the 
form  CaCpCs  or  of  the  form  CaSdCp,  where  0  <  a,  5  <  2tt,  tt  <  (3  <  2i t,  and  d  >  0. 

Moreover,  it  is  shown  in  [23]  that,  if  CaCpC&  is  an  optimal  path,  then  min{a,<5}  <  (3  — 
tt  and  max{a,  5}  <  (3.  A  path  that  corresponds  to  a  solution  to  the  DP  is  called  a  Dubins  path. 
Starting  from  Theorem  1,  it  can  be  shown  that  a  solution  to  the  DTSP  exists  [3,  Theorem  3.5.1]  [10, 
Proposition  2.3,  Theorem  3.1].  On  the  other  hand,  since  solutions  to  the  DP  are  not  unique  in 
the  sense  that,  in  general,  the  minimum  may  be  attained  by  more  than  one  trajectory,  solutions 
to  the  DTSP  are  not  unique,  a  fortiori  (an  example  is  given  in  [14]). 

The  formulation  of  the  DTSP  as  a  minimum-time  control  problem  hints  at  possible  algorithms 
for  the  computation  of  approximate  solutions.  The  description  of  such  an  algorithm  is  the  content 
of  the  next  section. 


2.3  The  k- step  look-ahead  algorithm 

The  motivation  behind  the  /c-step  LAA  is  to  replace  a  uniform  (in  [0,  27t))  or  random  assignment  of 
headings  to  each  target  with  a  judicious  choice  of  headings.  This  is  accomplished  by  employing  a 
receding  horizon  principle  that  allows  the  motion  planning  to  be  integrated  with  the  combinatorial 


11 


aspect  of  the  DTSP.  We  turn,  now,  to  a  concrete  description  of  these  ideas.  For  future  reference, 
let  us  call  the  problem  of  finding  a  minimum-time  trajectory  of  (£)  through  k  targets  the  “fc-step 
look-ahead  Dubins  problem”  (/.’-step  LADP).  A  precise  formulation  is  as  follows. 

k-step  LADP:  Let  k  be  a  positive  integer.  Given  a  point  p  G  M  and  an  ordered  k-tuple 
(Ni,...,Nk)  of  distinct  submanifolds  (targets)  of  the  form  A,  =  {(xj,r/j)}  x  S1  C  M,  where 
(. Xi,yi )  G  IR2  and  i  G  {l,...,fc};  minimise  the  time  T  >  0  over  the  set  of  trajectories  7  G 
W^([0,T]-,M)  of  (S)  such  that  7(0)  =  p,  7 (T)  G  Nk,  and  Ini7  D  Aj  7^  0,  for  every  i  G 

In  words,  we  seek  a  minimum-time  path  that  starts  from  p,  passes  through  every  target  in  the 
given  order,  and  ends  at  the  last  target.  A  few  comments  about  the  fc-step  LADP  are  in  order.  Let 
7  denote  a  solution  to  the  fc-step  LADP.  Let,  also,  tj  denote  the  first  time  instant  when  7 (tf)  G  Aj, 
where  i  —  1, . . . ,  k  and  tk  =  T.  First,  it  should  be  noted  that  the  restriction  f\[tk_1.tk\,  that  is,  the 
part  of  the  optimal  trajectory  between  the  last  two  targets  can  be  computed  much  more  efficiently 
if,  instead  of  Theorem  1  that  classifies  the  minimum-time  paths  between  two  given  states,  we  use 
the  following  result  of  [4]  that  classifies  the  Dubins  paths  between  a  given  initial  state  and  a  given 
final  position.  It  is  an  immediate  consequence  of  the  transversality  condition  at  Nk  [20]. 

Lemma  1.  A  path  that  corresponds  to  a  solution  to  the  1-step  LADP  is  either  of  the  form  CaCp 
or  of  the  form  CaSd,  where  0  <  a  <  2tt,  tt  <  (3  <  2i r,  and  d  >  0. 

According  to  the  terminology  used  thus  far,  Lemma  1  classifies  the  solutions  to  the  1-step 
LADP  (whose  corresponding  paths  are  also  known  as  relaxed  Dubins  paths)  and  waives  the 
need  to  check  all  possible  headings  at  the  final  target  when  an  optimal  solution  to  the  fc-step  LADP 
is  computed  numerically. 

Second,  given  an  instance  of  the  DTSP  with  n  targets,  a  solution  to  the  fc-step  LADP  with  k 
—  n  +  1  and  the  final  condition  q(T)  =  p  substituted  for  the  condition  7 (T)  G  Nk  is  a  globally 
optimal  solution  to  the  DTSP.  Clearly,  globally  optimal  trajectories  are  only  relevant  when  one 
seeks  to  comprehend  the  structure  of  the  solutions  to  the  DTSP  by  considering  instances  with  a 
few  targets  only. 

Third,  the  philosophy  behind  the  fc-step  LADP — and  the  advantage  of  the  resulting  fc-step  look¬ 
ahead  algorithm — is  that,  even  if  k  is  taken  to  be  small  (e.g.,  1  or  2)  relative  to  n  (the  number 
of  targets  in  a  given  instance  of  the  DTSP),  satisfactory  admissible  tours  can  still  be  obtained  by 
iteratively  solving  a,  perhaps  large,  number  of  computationally  tractable  problems. 

A  formal  description  of  the  k- step  LAA  is  somewhat  involved  and  the  bookkeeping  alone 
obscures  the  main  idea  behind  the  algorithm.  We  opt,  instead,  for  a  description  of  the  case 
k  —  2  by  means  of  a  representative  example  that  is  simple  enough  to  keep  the  presentation  clear, 
but  also  contains  the  essential  ideas  so  that  the  extension  to  the  general  case  (k  >  2)  becomes 
straightforward. 

Suppose  an  instance  of  the  DTSP  with  n  =  4  targets  is  given  and  the  algorithm  to  be  applied 
is  the  2-step  LAA.  The  first  step  is  to  construct  a  rooted  tree  whose  root  R  represents  the  initial 
condition  p  E  M  x  S1  of  the  Dubins  vehicle  and  induces  an  orientation  on  the  tree  away  from 
the  root.  The  children  of  the  root  represent,  temporarily,  the  four  possible  targets.  This  is  shown 
in  Figure  2.1.  The  representation  is  “temporary”  because,  eventually,  every  node  of  the  tree  will 
represent  a  target  together  with  a  heading  assigned  to  it,  not  simply  a  target.  Consequently, 
different  nodes  may  correspond  to  different  headings  at  the  same  “physical”  target.  Next,  we 
assign  headings  to  each  one  of  the  children/targets.  To  this  end,  each  child  of  R  is  replicated  as 
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Figure  2.1:  The  root  R  of  the  tree  represents  the  initial  condition  of  the  Dubins  vehicle  and  the 
children  of  the  root  represent  four  targets  before  headings  are  assigned  to  them. 


Figure  2.2:  Each  child  of  the  root  in  Figure  2.1  is  duplicated  as  many  times  as  the  number  of 
possible  subsequent  targets.  The  labels  on  the  nodes  represent  the  order  in  which  the  targets  are 
visited  and  each  node  represents  a  state  of  the  Dubins  vehicle,  that  is,  a  location  and  a  heading. 


many  times  as  the  number  of  possible  subsequent  targets.  For  example,  after  visiting  target  A, 
there  are  three  options:  to  visit  either  target  B  or  C  or  D.  Therefore,  the  root  has  A|  =  ||  =  12 
children. 2 This  process  leads  to  the  tree  in  Figure  2.2  and  is  a  consequence  of  setting  k  =  2  because, 
now,  each  node  XT',  where  X,  Y  G  {A,  B,  C,  D}  and  X  ^  Y,  can  be  used  to  represent  the  target 
X  with  the  heading  assigned  to  it  by  solving  the  2-step  LADP  with  initial  condition  p  (represented 
by  the  root  R),  first  target  X,  and  second  target  Y.  In  the  notation  of  the  previous  sections,  if  7  is 
a  solution  to  such  a  2-step  LADP,  r  g]0,  T[  is  the  first  time  when  q(r)  G  X ,  and  (x,  y ,  6)  are  local 
coordinates,  then  the  child  XY  is  assigned  the  heading  9(t).  The  length  of  the  Dubins  path  that 
corresponds  to  7|[o,T]  is  assigned  as  weight  to  the  edge  that  connects  p  to  XY.  This  assignment 
of  headings  allows  us  to  view  the  node  XY  as  the  state  of  the  Dubins  vehicle  that  consists  of 
the  position  of  the  target  X  and  the  heading  6{t).  Because  the  heading  at  a  target  X  depends 
on  the  target  Y  that  is  visited  next,  the  grandchildren  of  the  root  p  are  not  arbitrary.  Rather,  a 
child  of  a  node  XY  has  to  be  of  the  form  XY Z  (i.e.,  the  first  two  letters  remain  the  same),  where 
X,Y,Z  G  {A,  B,C,D}.  Figure  2.3  illustrates  this  idea  which  is  a  design  choice:  we  could  allow 
the  children  of  the  node  XY  to  be  of  the  form  XW Z  with  W  not  necessarily  equal  to  Y,  however 
such  a  choice  would  vitiate  the  anticipative  nature  of  the  algorithm.  Constraining  the  children  of 
each  node  in  this  way  also  has  the  effect  of  reducing  the  total  number  of  nodes  in  the  resulting 
tree. 

2By  definition, 
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Figure  2.3:  The  rooted  tree  used  to  find  a  feasible  tour  for  a  DTSP  with  n  =  4  targets  and 
look-ahead  horizon  k  =  2. 


Having  assigned  a  heading  to  every  child  XY  of  p,  we  can  proceed  in  the  same  manner  and  compute 
the  weights  between  the  children  and  the  grandchildren  of  R.  The  weight  of  the  edge  between  two 
nodes  XY  and  XY Z  is  computed  by  solving  the  2-step  LADP  with  initial  condition  XY  (recall 
that  nodes  represent  states),  first  target  Y  and  second  target  Z.  By  repeating  this  procedure,  the 
tree  in  Figure  2.3  is  constructed.  To  conform  with  the  definition  of  a  rooted  tree,  the  dashed  part 
in  Figure  2.3  should  not  be  considered  as  being  formally  part  of  the  tree;  it  is  included  as  a  visual 
aid  to  the  description  of  the  algorithm.  Towards  the  lower  end  of  the  tree  an  off-by-one  issue  has 
to  be  resolved,  but  this  can  be  done  in  a  straightforward  manner.  Specifically,  the  heading  of  a 
node  (state)  with  label  XYZWR ,  that  is,  of  a  node  that  corresponds  to  the  last  target  before 
returning  the  Dubins  vehicle  to  its  initial  condition,  has  to  be  computed  by  solving  a  2-step  LAA 
with  the  final  condition  y(T)  =  p  substituted  for  the  condition  y(T)  e  Nk.  Similarly,  when  k  >  2, 
the  value  of  k  has  to  be  reduced  towards  the  final  stages  of  the  construction  of  the  tree,  simply 
because  the  look-ahead  horizon  of  the  algorithm  will  exceed  the  number  of  targets  that  are  left  to 
be  considered  (this  reduction  is,  of  course,  computationally  beneficial).  Let  the  tree  constructed  by 
the  fc-step  LAA  be  denoted  by  Gk,n  =  (14, n,  Ek,n)  (in  Figure  2.3,  k  =  2  and  n  —  4).  The  following 
proposition  gives  an  upper  bound  on  the  length  of  shortest  paths  from  the  root  R  to  the  terminal 
node  of  Gk,n- 

Proposition  1.  A  shortest  path  in  Gk,n  from  the  root  to  the  (ficticious)  terminal  node  represents 
a  DTSP  tour  of  length  at  most 

ETSP(n)  +  (n  +  1)/V7 vp, 

where  n  is  the  number  of  targets,  n  is  a  constant,  and  ETSP(n)  denotes  the  length  of  a  solution 
to  the  corresponding  ETSP. 

Proof.  Every  admissible  solution  to  the  DTSP  is  a  concatenation  of  Dubins  paths  7 ;,  i  =  1, . . . ,  n+ 
1,  and  the  length  £( 7)  of  a  Dubins  path  7  between  two  targets  satisfies  £(7)  <  d  +  nnp,  where  d 
is  the  Euclidean  distance  between  the  targets  and  n  G  [2.657,2.658]  is  a  constant  [22,  Thm  3.4], 
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Moreover,  an  instance  of  the  DTSP  can  also  be  viewed  as  an  instance  of  the  ETSP  and  one  of  the 
paths  from  the  root  R  to  the  terminal  node  in  Gk,n  corresponds  to  the  optimal  order  Oetsp  f°r  the 
ETSP.  Therefore,  the  following  bound  holds  for  the  length  Lkp{n)  of  a  shortest  path  from  R  to  the 
terminal  node 

Lp(n )  =  ES1  Ki i)  <  ES1  ( di  +  ^ P )  (2 

=  ETSP(n)  +  (n  +  l)K7rp, 

where  ETSP(n)  denotes  the  length  of  the  solution  ctetsp-  □ 

Remark.  The  simulations  in  Section  2.5  provide  strong  evidence  that  the  bound  (2.1)  is  not 
always  sharp,  especially  when  the  average  intercity  distance  is  comparable  to  the  minimum  turn 
radius  of  the  Dubins  vehicle.  This  is  to  be  expected  because  the  derivation  of  (2.1)  does  not  take 
into  consideration  essential  features  of  the  fc-step  LAA  such  as  ordering  the  targets  independently 
of  the  solution  to  the  ETSP  and  the  use  of  a  receding  horizon  principle.  On  the  other  hand,  it  is 
obvious  that 

lim  Lk  (n)  =  ETSP(n). 

p—>0  p 

Once  the  tree  Gk,n  has  been  constructed  the  /c-step  LAA  reduces  to  a  shortest  path  algorithm. 
To  this  end,  existing  algorithms  can  be  adopted.  For  the  implementation  of  the  1-step  and  2-step 
LAA  used  to  generate  the  simulations  in  Section  2.5,  Dijkstra’s  algorithm  was  chosen  because  of 
its  simplicity  and  because  it  is  guaranteed  to  return  a  shortest  path  in  the  tree.  As  far  as  the 
resulting  time-complexity  of  the  fc-step  LAA  is  concerned,  a  first  observation  is  that 

imi  >  i + £"=.  5^,  (2.2) 

where  equality  holds  for  k  —  1  and  the  standard  convention  0!  =  1  is  assumed.  In  the  case  k  =  2 
that  we  use  as  our  running  example,  we  have 

\Vz, n\  =  1  +  n\  +  EIL2  ppiqT  =  |hi,n|  +n\  —  n.  (2.3) 

Remark.  If  we  construct  a  tree  G2,n  =  (bgn,  E2tTl)  that  represents  all  possible  DTSP  tours  in 
the  case  where  two  headings  are  assigned  to  each  target  either  randomly  or  by  partitioning  [0,  27t) 
into  two  subintervals,  then  \V2jn\  increases  faster  than  |R2,n|  with  respect  to  n.  The  following  table 
shows  the  values  of  |V2,n|  and  |V2,n|  as  the  size  n  of  an  instance  of  the  DTSP  increases. 


n 

3 

4 

5 

6 

7 

8 

9 

10 

\V2,n\ 

16 

65 

326 

1957 

13700 

109601 

986410 

9864101 

\V2,n\ 

19 

61 

201 

871 

5293 

40713 

363457 

3629611 

If  the  nrin-priority  queue  used  by  Dijkstra’s  algorithm  is  implemented  as  a  Fibonacci  heap, 
then  the  time-complexity  of  searching  in  Gk,n  is  [9] 

0(|I4,n|log|I4,n|  +  \EKn\)  =  0(|I4,n|  log  |Pfc,n|  +  \Vk,n\),  (2.4) 

where  we  used  the  fact  that  \Ek)Tl\  =  |I4,n|  —  1,  because  Gk,n  is  a  tree,  and  discarded  the  additive 
constant.  Additionally,  the  weight  of  every  edge  in  Ektn  is  computed  by  solving  a  /c-step  LADP. 
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Given  that  Dubins  paths  can  be  computed  in  constant  time  0(1),  if  we  partition  [0,  2i r)  uniformly 
into  h  subintervals,  then  the  time-complexity  for  a  numerical  approximation  of  a  solution  to  a  fc-step 
LADP  is  0(khk ~1),  if  we  ignore  additions  and  finding  minima  among  lists  of  numbers.  However, 
a  useful  observation  is  that  the  tree  Gk,n  does  not  have  to  be  constructed  in  its  entirety  and  then 
searched;  instead,  only  the  propagating  front  of  Dijkstra’s  algorithm  needs  to  be  stored.  This  idea, 
combined  with  an  upper  bound  for  the  DTSP  tour  can  reduce  the  number  of  computations  and 
amount  of  memory  required.  The  above  discussion  is  summarised  in  the  following  proposition  for 
the  case  k  =  2  in  which  simple  explicit  formulae  like  (2.3)  can  be  obtained.  Moreover,  in  practice, 
the  2-step  LAA  seems  to  achieve  a  good  balance  between  computation  time  and  the  size  of  the 
resulting  DTSP  tour. 

Proposition  2.  The  worst-case  time- complexity  of  the  2-step  look-ahead  algorithm  is 

O  (|V2,n|  log  |V2,n|  +  (4 h  +  l)|y2,n|  +  n!) ,  (2.5) 

where  h  is  the  number  of  subintervals  into  which  [0,  27r)  is  partitioned  when  approximating  a  solution 
to  the  2-step  LADP. 

Proof.  Assuming  that  the  whole  tree  G2,n  has  to  be  searched  until  a  shortest  path  is  found,  then, 
as  explained  above,  the  running  time  for  the  search  procedure  is 

0(\V2,n\log\V2,n\  +  \V2,n\). 

Assigning  a  weight  to  every  edge  of  G2,n  requires 

0((2h  +  h  +  h)\V2,n\)  =  0(4h\V2,n\)  (2.6) 

operations  and  an  additional  0(n\)  to  compute  the  weights  of  the  dashed  edges  in  Figure  2.3  that 
correspond  to  the  final  Dubins  paths  that  complete  the  DTSP  tours.  The  term  (2 h  +  h  +  h)  in 
equation  (2.6)  comes  from  the  fact  that  we  consider  h  headings  in  [0,  27r)  and,  for  each  heading, 
we  have  to  compute  two  Dubins  paths  (one  path  that  connects  the  initial  condition  of  the  Dubins 
vehicle  to  the  first  target  and  one  path  that  connects  the  first  to  the  second  target;  see  the 
formulation  of  the  fc-step  LADP),  sum  their  lengths,  and  find  the  minimum  among  the  h  sums  of 
lengths.  That  is,  there  are  2 h  computations  of  Dubins  paths,  h  additions,  and  finding  a  minimum 
among  h  numbers  (which  takes  0(h)  [9]).  Combining  everything  together  gives  (2.5).  □ 

Similar  analyses  can  be  carried  out  for  different  values  of  the  look-ahead  horizon  k,  however 
the  calculations  need  to  account  for  a  varying  k  and,  hence,  are  more  involved. 

Once  headings  have  been  assigned  to  the  targets  using  the  receding  horizon  principle  behind 
the  fc-step  LAA,  another  possibility  for  obtaining  an  admissible  DTSP  tour  is  to  view  the  vertices 
in  Vk  as  a  GATSP  with  n  clusters.  Each  cluster  consisting  of  the  vertices  in  Vk  that  represent 
different  headings  at  the  same  target.  The  resulting  GATSP  can  be  subsequently  transformed  to 
an  ATSP  and  solved  by  means  of  an  approximation  algorithm  such  as  the  algorithm  in  [16,  Thm 
4.1].  When  an  approximation  algorithm  for  the  ATSP  is  incorporated  as  part  of  the  fc-step  LAA, 
the  solution  space  is  different  (larger)  than  in  the  case  where  a  shortest  path  algorithm  in  Gk,n  is 
used,  unless  the  constraint  that  a  child  of  a  node  of  the  form  XY  be  of  the  form  XYZ  is  imposed, 
as  explained  in  Section  2.3. 

We  conclude  this  section  with  one  final  remark  on  the  applicability  of  the  k- step  LAA  to  large 
instances  of  the  DTSP.  One  possibility  is  to  create  small  clusters  of  targets  according  to  a  suitable 
criterion,  apply  the  fc-step  LAA  to  each  cluster,  and  combine  the  resulting  tours  into  a  single  tour. 
This  method  is  facilitated  if  the  targets  naturally  form  spatially  separated  clusters. 
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Figure  2.4:  A  schematic  representation  of  a  2-0pt  move. 


2.4  A  local  improvement  algorithm 

The  goal  of  this  section  is  to  describe  an  algorithm  that  can  be  applied  to  instances  of  the  DTSP 
that  are  too  large  for  the  fc-step  LAA  to  be  practical.  The  motivation  behind  the  algorithm  is 
twofold.  First,  to  leverage  one  of  the  many  ideas  that  have  been  successfully  applied  to  the  ETSP 
and,  second,  to  exploit  the  fact  that,  if  the  order  of  the  targets  is  fixed,  then  an  admissible  DTSP 
tour  can  be  found  quickly.  For  a  thorough  description  of  the  ideas  that  follow,  in  their  original 
context  (the  TSP  and  the  ETSP),  and  related  results  the  reader  is  referred  to  [15,  Section  3]. 

A  class  of  algorithms  that  are  particularly  effective  for  the  ETSP  is  that  of  local  improvement 
algorithms,  also  known  as  exchange  heuristics.  These  are  algorithms  characterised  by  a  set  of 
exchanges  or  moves  that  reorder  the  targets.  A  local  improvement  algorithm  attempts  to  decrease 
the  length  of  a  given  tour  by  performing  a  sequence  of  admissible  moves.  If  a  move  results  in 
a  shorter  tour,  then  the  new  tour  is  stored  as  the  current  optimum,  otherwise  the  new  tour  is 
discarded.  A  sequence  of  moves  is  performed  until  either  there  there  is  no  improvement  or  a 
predetermined  maximum  number  of  moves  is  reached.  The  most  widely  known  local  improvement 
algorithms  are  the  k- Opt  algorithms.  The  idea  behind  them  is  to  remove  k  edges  from  a  given 
tour,  thus  breaking  the  tour  into  separate  paths,  and  reconnect  these  paths  in  different  ways.  For 
example,  if  ACBDA  is  the  initial  tour  in  Figure  2.4,  then  an  example  of  a  2-Opt  move  performed 
by  a  2-Opt  algorithm  is  the  replacement  of  edges  (A,  C)  and  (B,D)  by  edges  (A,  B)  and  (C,D). 
In  this  contrived  example,  a  single  move  leads  to  the  optimal  tour. 

Analysing  the  performance  of  k- Opt  algorithms  is  a  non-trivial  matter  (e.g.,  determining  after 
how  many  moves  they  halt,  their  approximation  ratio,  or  if  they  are  guaranteed  to  improve  a  given 
initial  tour)  and  in  practice  they  perform  better  than  what  the  theoretical  analysis  suggests  [15]. 
One  reason  behind  the  discrepancy  between  the  theoretical  bounds  and  the  actual  performance 
of  the  algorithms  is  that,  in  deriving  worst-case  bounds,  it  is  assumed  that  an  adversary  chooses 
the  initial  tour  or  that  the  adversary  also  chooses  the  moves  performed  by  the  algorithm.  The 
situation  is  even  more  complicated  in  the  case  of  the  DTSP  because  much  less  is  known  analytically 
about  the  properties  of  optimal  solutions.  For  example,  an  uncrossing  move  like  the  one  shown 
in  Figure  2.4)  cannot  increase  the  length  of  an  ETSP  tour  and  it  is  known  that  all  crossings  can 
be  eliminated  from  a  tour  by  applying  at  most  n3  2-Opt  moves  that  remove  crossings.  Although 
there  are  no  analogous  results  for  the  DTSP,  the  main  idea  behind  the  k- Opt  algorithms  can  be 
combined  with  randomisation  to  obtain  an  algorithm  for  the  DTSP  that  performs  well  in  practice. 
For  concreteness  and  simplicity  we  consider  only  the  combination  of  a  2-Opt  move  with  the  k- step 
ETSP-LAA.  A  2-Opt  move  means  that,  given  a  DTSP  tour,  we  choose  two  pairs  of  consecutive 
cities  (A,  B)  and  (C,  D)  on  the  tour  and  alter  the  tour  so  that  C  is  visited  after  A  and  D  is  visited 
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2-OPT  /-STEP  LOOK-AHEAD  ALGORITHM 

An  instance  of  the  DTSP  and  a  positive  integer 

f  max 

An  admissible  DTSP  tour 

/max  t—  maximum  number  of  moves 

Find  an  initial  order  <to  for  the  targets. 

Use  (To  to  apply  the  /-step  ETSP-LAA  and  obtain 
an  initial  DTSP  tour  of  length  Lo  and  a  vector 
Hq  containing  the  headings  at  the  targets, 
for  i  =  1  to  /max 

Apply  a  randomly  generated  2-Opt  move  to  <r0 
to  find  a  new  order  a. 

Use  a  to  apply  the  /-step  ETSP-LAA  and  ob¬ 
tain  a  DTSP  tour  of  length  L  and  a  vector  H 
containing  the  headings  at  the  targets, 
if  L  <  Lo 
o-0  <-  a 
H0^H 
Lq  t—  L 
end  if 
end  for 

return  <to,  Hq,  and  Lq 


Table  2.1:  The  2-Opt  k- step  look-ahead  algorithm 


after  B.  Having  no  easy  (systematic)  way  to  tell  a  priori  which  moves  will  not  increase  the  length 
of  a  tour,  the  2-Opt  moves  are  generated  randomly.  The  algorithm  terminates  when  a  predefined 
maximum  number  Jmax  of  moves  has  been  attempted.  The  corresponding  2-Opt  fc-step  look-ahead 
algorithm  is  described  in  Table  2.1.  To  obtain  an  upper  bound  for  the  output  of  the  2-Opt  k- 
step  LAA,  suppose  that  an  initial  order  for  the  targets  is  found  using  an  approximation  algorithm 
for  the  ETSP  (step  2  in  Table  2.1)  with  approximation  ratio  a.  In  principle,  such  an  algorithm 
could  be  Christofides’s  algorithm  [6]  which  has  approximation  ratio  3/2  or  obtained  from  Arora’s 
polynomial  time  approximation  scheme  for  the  ETSP  [2],  Then  we  have  the  following. 

Proposition  3.  Given  an  instance  of  the  DTSP  with  n  targets,  the  2-Opt  k-step  look-ahead  algo¬ 
rithm  returns  a  DTSP  tour  of  length  at  most 

a  ■  ETSP(n)  +  (n  +  l)nnp.  (2.7) 

Proof.  Given  an  initial  ETSP  tour  whose  length  is  a  times  the  length  of  an  optimal  tour,  equa¬ 
tion  (2.1)  implies  that  the  length  L0  of  the  corresponding  DTSP  tour  (step  3  in  Table  2.1)  satisfies 

L0  <  a  ■  ETSP(n)  +  (n  +  l)/?7rp. 

In  the  worst  case  scenario,  no  2-Opt  move  will  result  in  any  improvements  of  the  DTSP  tour  and 
(To,  H0,  and  L0  will  be  the  output  of  the  2-Opt  2-step  look-ahead  algorithm.  □ 

Similarly  to  the  case  of  the  /-step  LAA,  a  bound  that  relies  on  the  order  of  the  targets  that 
corresponds  to  a  solution  to  the  ETSP  (possibly  approximate  solution,  in  the  case  of  the  2-Opt 
/-step  LAA)  is  not  expected  to  be  sharp,  unless  the  minimum  turn  radius  of  the  Dubins  vehicle  is 
small  relative  to  the  average  intercity  distance.  On  the  other  hand,  the  right-hand  side  of  (2.1)  is 
susceptible  to  relatively  accurate  estimations  by  means  of  the  Hcld-Karp  bound  for  the  ETSP  [15, 
Section  2.3]. 
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2.5  Simulations 


2.5.1  The  2-step  look-ahead  algorithm 

The  fc-step  LAA  can  also  be  used  simply  as  a  receding  horizon  algorithm  on  a  sequence  of  n  targets 
(N^U  diat  have  been  ordered  by  some  other  method,  e.g.,  by  solving  an  ETSP.  Specifically, 
suppose  that  the  targets  have  been  reindexed  so  that  iVj  is  the  i-th  target.  Starting  from  the 
initial  condition  p,  a  solution  7  is  found  to  the  fc-step  LADP  that  corresponds  to  the  first  k 
targets.  Then,  only  the  part  7|[o,o]  that  connects  p  to  N\  is  kept  and  the  point  7(^1)  is  considered 
as  a  new  initial  condition  from  which  the  k- step  LADP  for  the  targets  N2  to  can  be  solved. 
This  procedure  is  repeated  until  an  admissible  tour  is  constructed  with  the  horizon  k  being  reduced 
as  necessary  when  less  than  k  targets  are  left.  Recall  that  we  called  this  method  for  finding  DTSP 
tours  the  “fc-step  ETSP-LAA”. 

In  this  section,  the  following  five  algorithms  are  compared  by  means  of  Monte  Carlo  simulations 
and  the  results  are  shown  in  Figures  2.5  to  2.7. 

1.  1-step  ETSP-LAA:  An  ETSP  is  solved  to  order  the  targets  and  the  look-ahead  horizon  is 
set  to  k  —  1. 

2.  2-step  ETSP-LAA:  An  ETSP  is  solved  to  order  the  targets  and  the  look-ahead  horizon  is 
set  to  k  —  2. 

3.  1-step  LAA:  The  A;-step  LAA  of  Section  2.3  for  k  —  1. 

4.  2-step  LAA:  The  fc-step  LAA  of  Section  2.3  for  k  —  2. 

5.  AA:  (alternating  algorithm)  This  is  the  algorithm  described  in  [22], 

Remark.  In  [17],  algorithms  (1)  and  (2),  above,  are  called  “two-point  algorithm”  and  “look-ahead 
algorithm” ,  respectively.  Note,  however,  that,  in  [17],  these  algorithms  are  applied  to  sets  of  targets 
that  are  ordered  randomly,  as  opposed  to  being  ordered  by  first  solving  the  corresponding  ETSP. 

For  each  number  of  targets  shown  in  the  x-axes  of  Figures  2.5,  2.6,  and  2.7,  100  instances  of 
the  DTSP  were  randomly  generated  using  the  implementation  of  the  Mersenne  twister  provided  by 
the  Boost  C+- 1-  Libraries  [1].  The  initial  condition  was  set  equal  to  p  —  (0,  0,7t/2)  and  the  targets 
were  contained  in  [— 2.5,2. 5]2  C  R2  with  uniform  distribution.  Next,  each  algorithm  was  applied 
to  all  randomly  generated  instances  and  the  length  of  each  DTSP  tour  was  normalised  (divided) 
by  the  length  of  the  solution  to  the  ETSP  for  the  same  set  of  targets  (hence,  the  normalised  length 
of  the  solution  to  the  ETSP  is  always  equal  to  1).  The  y- axes  correspond  to  the  average  of  these 
normalised  lengths.  The  three  figures  correspond  to  three  different  minimum  turning  radii.  As 
expected,  when  the  minimum  turn  radius  is  small  (i.e. ,  p  =  0.1  in  Figure  2.7)  relative  to  the 
distance  between  the  targets,  the  difference  between  the  output  of  the  five  algorithms  is  negligible 
(note  the  range  of  the  y-axis).  In  such  a  case,  execution  speed  becomes  the  dominant  factor  in 
choosing  an  algorithm  and  the  2-step  LAA  is  the  most  time-consuming  among  the  five  algorithms. 
In  all  cases,  however,  the  2-step  LAA  yields  the  shortest  DTSP  tours  as  the  number  of  targets 
increases. 

Figure  2.5  corresponds  to  minimum  turn  radius  p  =  1  and  can  be  used  to  quantify  our  earlier 
statement  that  the  bound  (2.1)  is  not  always  sharp.  We  observe  that,  for  the  generated  instances 
of  the  DTSP, 

L\{n)  <  1.7  •  ETSP(n),  n  =  3,...,  9,  (2.8) 
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Figure  2.5:  Comparison  of  five  algorithms  via  Monte  Carlo  simulations.  For  each  number  of  targets 
shown  in  the  horizontal  axis,  the  algorithms  were  applied  to  100  randomly  generated  instances  of 
the  DTSP.  For  a  fixed  minimum  turn  radius  equal  to  1  and  a  fixed  initial  condition  of  the  Dubins 
vehicle  (0,0,7t/2),  the  positions  of  the  targets  were  generated  with  uniform  distribution  in  the 
square  [—2.5,  2.5]2  C  M2.  The  length  of  every  DTSP  tour  was  normalised  (divided)  by  the  length 
of  the  ETSP  tour  for  the  same  set  of  targets  and  the  vertical  axis  corresponds  to  the  average 
normalised  lengths  of  the  DTSP  tours. 


on  the  average.  For  the  sake  of  comparison,  if  we  set  n  =  9  and  ETSP  (9)  equal  to  the  average 
value  ETSP (9)  =  13.387  of  the  length  of  the  100  ETSP  tours  with  9  targets  used  to  generate 
Figure  2.5,  then  the  right-hand  side  of  (2.1)  becomes  96.891,  whereas  the  right-hand  side  of  (2.8) 
becomes  22.758. 

In  Figure  2.8,  the  five  algorithms  are  compared  on  an  instance  of  the  DTSP  where  five  targets 
and  the  initial  condition  lie  on  a  circle  of  radius  r  =  1.1  and  the  minimum  turn  radius  p  is  equal 
to  1.  This  example  is  chosen  because  it  allows  a  comparison  between  the  output  of  each  algorithm 
and  the  optimal  DTSP  tour  which  is  approximately  a  circle  of  radius  r  (Fig.  2.8e).  Moreover, 
it  demonstrates  how  reliance  on  the  solution  to  the  ETSP  can  lead  to  DTSP  tours  that  are  far 
from  optimal.  Specifically,  the  optimal  ETSP  tour  is  a  polygon  inscribed  in  the  circle  of  radius  r, 
however,  if  the  targets  are  ordered  clockwise  (the  orientation  of  a  tour  is  immaterial  for  the  ETSP) 
the  fc-step  ETSP-LAA,  for  k  =  1,2,  and  the  alternating  algorithm  output  the  tours  shown  in  Fig. 
2.8a,  2.8b,  and  2.8c,  respectively. 

2.5.2  The  2-Opt  2-step  look-ahead  algorithm 

For  the  simulations  in  this  section,  the  2-Opt  fc-step  LAA  was  implemented  for  k  =  2  and  a 
simplified  version  of  the  2-Opt  algorithm  for  the  ETSP  was  used  to  find  an  initial  order  ay  for  the 
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Figure  2.6:  Same  as  Figure  2.5,  but  with  p  =  10. 
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Figure  2.7:  Same  as  Figure  2.5,  but  with  p  =  0.1.  The  scale  of  the  vetical  axis  shows  that,  when 
the  minimum  turn  radius  is  small  compared  to  the  average  distance  between  the  targets,  the  DTSP 
tours  approximate  the  solution  to  the  ETSP. 
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Figure  2.8:  Comparison  of  five  algorithms  on  an  instance  of  the  DTSP  with  the  5  targets  and  the 
initial  condition  lying  on  the  circle  x2  +  y2  —  r2,  r  —  1.1.  The  minimum  turn  radius  is  p  =  1  and 
the  initial  condition  of  the  Dubins  vehicle  is  p  =  (1,0,  7t/2).  The  DTSP  tour  generated  by  the 
2-step  LAA  practically  coincides  with  the  optimal  solution. 
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Approximate  solution  to  the  ETSP  Alternating  algorithm.  Total  length  =  3.5097  ETSP 


2-step  ETSP-LAA.  Total  length  =  3.6188  ETSP  2-Opt  2-step  LAA.  Total  length  =  1 .8959  ETSP 


(c)  (d) 


Figure  2.9:  Comparison  of  the  2-Opt  2-step  LAA,  the  2-step  ETSP-LAA,  and  the  alternating 
algorithm  on  a  single  instance  of  the  DTSP  with  20  targets.  Similarly  to  Figure  2.5,  the  minimum 
turn  radius  is  p  =  1,  the  initial  condition  of  the  Dubins  vehicle  is  p  =  (0,  0,  vr/2),  and  the  positions 
of  the  targets  are  randomly  and  uniformly  distributed  in  [— 2.5,2. 5]2  C  M2.  The  ETSP  tour  in  (a) 
is  found  using  a  simplified  version  of  the  2-Opt  algorithm  and  is  not  optimal.  In  (d)  the  maximum 
number  Jmax  of  2-Opt  moves  equals  1000. 


targets.  The  same  order  was  provided  as  input  to  the  2-step  ETSP-LAA  and  the  alternating 
algorithm.  Because  the  2-Opt  algorithm  is  not  guaranteed  to  find  a  global  minimum,  it  should 
be  noted  that  it  is  with  respect  to  this  approximate  solution  to  the  ETSP  that  the  lengths  were 
normalised.  Applying  the  2-Opt  2-step  LAA  to  an  instance  of  the  DTSP  with  n  =  20  and  Jmax  = 
1000  requires  less  than  five  minutes.  An  example  is  shown  in  Figure  2.9.  Besides  experimenting 
with  the  value  of  Jmax,  there  are  improvements  that  can  be  made  in  the  implementation  of  the  2-Opt 
k- step  LAA  to  reduce  the  execution  time.  First,  the  algorithm  in  Table  2.1  is  easily  parallelisable 
in  a  way  that  multiple  threads  can  be  executed  almost  independently.  Second,  when  a  2-Opt  move 
is  applied  to  <To  to  obtain  a  new  order  a  (step  5),  the  tours  that  correspond  to  Co  and  a  can  have 
a  large  part  in  common  and  the  new  tour  need  not  always  be  computed  anew.  In  general,  this  is 
the  case  if  the  2-Opt  move  alters  the  latter  part  of  cr0. 
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Mulitple  UAV  scenarios 


The  problem  considered  in  the  second  phase  of  the  research  includes  allocating  a  group  of  vehicles 
to  a  given  set  of  targets  while  taking  into  account  the  vehicles’  kinematic  constraints  and  avoiding 
collision  with  obstacles  scattered  in  the  environment.  Each  target  is  assigned  with  a  time  dependent 
value  (referred  to  as  the  target  benefit)  that  represents  the  target’s  importance  and  priority.  The 
objective  is  to  maximize  a  reward  function  which  is  the  sum  of  all  the  benefits  gathered  by  the 
group  of  vehicles. 

3.1  Problem  Formulation 

3.1.1  Vehicles 

Let  V  =  (Vj,  V‘2, ..,  Vnv}  be  a  set  of  unmanned  aerial  vehicles  which  need  to  complete  the  visit 
requirements  of  the  given  set  of  targets.  The  vehicles  have  a  minimum  turn  radius  and  can  move 
only  forward.  The  kinematic  constraints  need  to  be  accounted  for  when  planning  the  vehicles’ 
trajectory.  The  equations  of  motion  are  presented  below: 

Vehicle  kinematics 

x  —  v  cos  if} 
y  —  v  sin  ip 
ip  —  u 

Motion  constraints: 

only  forward  motion  is  allowed 

v  =  U 

Turn  rate  constraint  (Given  a  minimum  turn  radius) 

\u\^U/Rmin  (3.3) 

where  (x,y)  are  the  vehicle’s  Cartesian  coordinates  and  ip  is  the  vehicle’s  orientation  angle,  U  and 
to  are  the  vehicle’s  constant  speed  and  maximum  turn  rate  respectively. 


(3.2) 


(3.1) 
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3.1.2  Targets  and  Benefits 

Let  T  =  {7j ,  T2, TNt}  be  the  set  of  NT  stationary  targets  designated  for  the  group  of  vehicles.  Let 
C  =  {Ci,  C2, Cjvt}  be  the  set  of  initial  benefits  assigned  to  each  target  and  let  S  =  {1,  2, NT} 
be  the  set  of  stages  in  which  a  target  is  allocated  as  an  assignment  to  a  vehicle.  The  target’s 
benefit  represents  the  value  granted  to  a  vehicle  for  visiting  the  target.  Since  the  benefits  are  time 
dependent,  we  propose  a  mathematical  formulation  which  is  referred  to  as  the  ” benefit  function”. 
This  formulation  represents  the  reward  granted  to  a  vehicle  for  visiting  a  target,  depending  on  the 
target’s  priority  (represented  by  its  initial  benefits)  and  the  time  it  takes  the  vehicle  to  get  to  the 
target  from  its  initial  position. 

Let 

C  =  L?JV  (3.4) 

be  the  time  required  for  vehicle  i  G  V  to  travel  to  target  k  G  T  at  stage  m  G  S  from  its  current 
position.  L^,  V  are  the  path  length  taken  by  vehicle  i  to  visit  target  k  at  stage  m  and  the  vehicle 
constant  speed,  respectively.  Note  that  depends  on  the  assignment  history  of  vehicle  i  prior  to 
stage  m.  The  assignment  history  is  used  to  obtain  the  initial  condition  (position  and  orientation) 
required  to  calculate  the  vehicle’s  path.  Let  A  be  a  user  defined  coefficient  which  defines  the  benefit 
function’s  descent  rate.  Y^m=i  J2k= i  tfk  is  the  total  time  required  for  vehicle  i  to  travel  to  target 
k  from  its  initial  position  at  the  current  stage  l  .  We  use  the  descent  rate  coefficient  and  the  total 
time  passed  until  the  vehicle  visits  the  current  target  to  formulate  the  benefit  function  as  follows: 

Cje~A^-‘m=1  hfc  (3.5) 

This  formulation  helps  create  a  problem  in  which  the  vehicle  assignments’  order  depends  on  the 
path  to  each  target  and  not  only  on  the  target’s  initial  priority  (for  example,  the  highest  priority 
target  is  not  necessarily  visited  first  and  the  time  to  arrive  at  the  target’s  location  is  also  taken  into 
consideration).  In  addition,  the  same  formulation  can  be  used  to  describe  the  example  scenario 
that  includes  a  UGS  network  and  a  team  of  unmanned  vehicles  used  for  intruder  detection  and 
identification.  The  vehicles’  response  time  is  taken  into  account  by  calculating  the  vehicles’  path 
length,  and  the  targets’  different  initial  priority  represents  the  order  of  the  UGS  triggering  time. 
Since  the  time  it  takes  a  vehicle  to  reach  a  target  depends  on  the  vehicle’s  path  length,  the  latter 
will  be  calculated  using  a  motion  planning  subroutine,  described  in  section  3.2.  Figure  3.1a  shows 
the  benefit  function’s  change  over  time,  each  curve  begins  with  a  different  initial  value  (initial 
benefits  3  and  10).  The  benefit  function  is  a  monotonically  decreasing  function  and  as  such,  the 
initial  value  diminishes  as  time  progresses.  When  the  descent  rate  coefficient  is  changed  (increased 
by  5  times),  the  benefit  rapidly  diminishes  over  time,  as  seen  in  Figure  3.1b.  The  increase  of  the 
descent  rate  may  also  cause  a  change  in  the  targets  assigned  to  each  vehicle  or  a  different  order  in 
which  the  assigned  targets  need  to  be  visited. 


3.1.3  Cost  function 


The  objective  is  to  complete  the  visit  requirement  (visiting  the  given  set  of  targets  once)  so  as  to 
maximize  a  reward  function.  The  reward  function  considered  is  the  overall  of  benefits  acquired  by 
the  vehicles: 


Ny  Nj'  Nj' 

Ji  =  ^^^[Cje~A^™=1^=A%x'%]xlij 

i= 1  1=1  j= 1 


(3.6) 
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Benefit  Function 


Time  [sec] 


(a)  Benefit  Function  Over  Time 


(b)  Benefit  Function  Over  Time  -  Increased  Decent  rate 


Figure  3.1:  Benefit  Function 


where  x\j  G  {0, 1}  be  a  binary  decision  variable  that  equals  1  if  vehicle  i  visits  target  j  at  stage 
l.  Since  the  benefit  function  diminishes  with  time,  we  formulate  a  ’’lost”  benefit  function  which 
is  given  by  Cj  —  Cje~A^m=1^k=itik.  The  ’’lost”  benefit  function  formulation  allow  the  definition 
of  a  cost  function  which  is  the  equivalent  to  the  reward  function  defined  above,  but  instead  of 
maximizing  the  reward  function  the  objective  is  to  minimize  the  cost  function.  The  cost  function 
mathematical  formulation  is  given  by 

Ny  Nt  N't 

id  -  cje-A  ss 

i=  1  1=1  j= 1 

The  constraints  of  the  problem  are  given  by: 

N’j1  Ny 

EXX- 

i=i  i=i 
Ny  Nt 

XIXX 

j= i  j= i 

Eq.  (3.8)  ensures  that  each  target  is  visited  once.  Eq.  (3.9)  ensures  that  in  each  stage  only  one 
vehicle  is  assigned  to  a  target. 

In  [31,  34,  26]  somewhat  similar  problem  involving  multiple  targets  and  vehicles  was  solved.  The 
cost  function  used  in  the  related  works  is  the  sum  of  the  path  lengths  of  all  the  vehicles  and  can 
be  formulated  as: 

Ny  Nt  N'j' 
i= 1  Z=1  j= 1 

In  these  cases  the  targets’  importance  is  identical  and  it  is  ignored  when  solving  the  problem.  In 
the  simulation  results’  section  we  use  this  cost  function  to  help  compare  the  performance  of  the 
proposed  algorithms. 


1,  3  =  1," 

■ ,  Nt 

(3.8) 

1,  V  /  =  1, 

■  ■  ■ ,  nt 

(3.9) 

(3.7) 
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The  solution  of  the  proposed  problem  includes  solving  two  integrated  subproblems:  task  assign¬ 
ment  and  motion  planning  problems.  In  order  to  minimize  the  cost  function,  the  task  assignment 
depends  on  the  underlying  motion  planning  for  the  path  length,  while  the  motion  planning  in  turn 
depends  on  the  task  assignment  for  the  order  of  the  vehicle’s  targets.  This  makes  the  problems 
coupled. 

The  motion  planning  problem  is  presented  next.  We  assume  that  each  vehicle  is  assigned  with 
a  list  of  an  ordered  set  of  targets,  made  by  the  task  assignment  algorithm.  The  goal  of  the  motion 
planning  is  to  derive  a  trajectory  for  each  vehicle  to  visit  all  targets  on  the  list,  avoid  collision  with 
obstacles  and  respect  the  vehicle  kinematic  constraints  described  in  section  3.1.1.  This  goal  should 
be  accomplished  in  minimum  time  in  order  to  maximize  the  benefit  acquired  from  each  target. 

3.2  Motion  planning 

In  order  to  represent  the  motion  planning  problem  in  the  form  of  a  decision  tree  it  is  necessary  to 
generate  nodes  representing  the  following:  targets  position,  vehicle’s  initial  configuration  and  the 
obstacles’  vertices  (under  the  assumption  of  polygonal  obstacles).  The  vehicle’s  path  will  either 
be  a  direct  path  (free  of  obstacles)  connecting  the  initial  configuration  and  the  set  of  targets,  or 
a  path  which  also  passes  through  some  of  the  obstacles  vertices,  in  case  a  direct  path  does  not 
exist.  Each  branch  of  the  tree  represents  the  described  path.  The  root  node  (initial  configuration) 
is  connected  to  all  of  the  targets  nodes  and  in  case  a  direct  path  is  not  feasible,  obstacles  nodes 
are  also  included.  The  goal  is  to  find  the  branch  that  provides  the  minimum  time  path. 

The  vehicles  in  this  work  are  modeled  as  Dubins  vehicles.  The  Dubins  paths  are  concatenation 
of  arcs  of  minimum  radius  turn  and  straight  line  segments  which  connect  an  initial  and  final 
configuration  (position  and  orientation).  The  optimal  path  can  be  achieved  by  checking  6  possible 
path  types  for  the  Dubins  vehicle  [11].  In  case  the  orientation  angle  in  the  final  configuration  is 
removed  the  number  of  possibilities  is  significantly  reduced.  This  is  known  as  the  relaxed  Dubins 
path  that  include  only  4  [4], 

An  important  benefit  obtained  by  using  the  relaxed  path  is  explained  using  the  following  exam¬ 
ple:  When  calculating  the  optimal  path  between  an  initial  (node  1),  final  (node  5)  configurations 
and  three  additional  unordered  configurations  (for  example:  obstacle  vertices)  located  between 
them  (nodes  2-4),  the  following  branches  of  the  tree  graph  are  generated:  A  branch  connecting 
nodes  1-2-3-4-5  and  a  branch  connecting  nodes  1-2-4-3-5.  In  the  relaxed  case  the  path  connect¬ 
ing  nodes  1  and  2  should  be  calculated  only  once,  as  it  is  independent  of  the  remaining  nodes. 
However,  in  the  non  relaxed  case,  the  arrival  angle  at  node  2  depends  on  the  order  of  the  following 
nodes  (node  3  or  4)  and  the  path  between  node  1  and  2  needs  to  be  calculated  separately  for  each 
branch.  This  attribute,  where  the  path  between  two  nodes  doesn’t  depend  on  the  following  nodes 
enables  us  to  pose  the  problem  as  tree  graph. 

In  order  to  find  the  relaxed  optimal  path  which  connects  the  initial  configuration  and  the 
targets’  set  and  doesn’t  intersect  with  obstacles,  it  is  necessary  to  search  the  tree  graph  presented 
above.  The  search  process  includes  calculating  the  relaxed  path  connecting  the  different  graph 
nodes  -  obstacles’  vertices  or  targets’  positions.  In  this  search  process  the  calculation  of  the  relaxed 
path  is  repeated  multiple  times,  especially  in  large  scale  scenarios.  In  case  real  time  scenarios 
are  considered,  the  use  of  the  relaxed  path  becomes  highly  beneficial  since  the  computational 
complexity  is  significantly  reduced,  compared  to  the  non  relaxed  case. 

In  this  work  an  existing  motion  planning  algorithm  is  used  as  a  subroutine  for  the  developed 
task  assignment  algorithm.  The  motion  planning  solution  can  be  achieved  by  one  of  two  algorithms: 
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Figure  3.2:  A  Tree  graph  for  2  vehicles  and  3  targets 


1)  An  Exhaustive  algorithm  which  explores  every  branch  of  the  search  tree  and  evaluates  every 
possible  visit  order  sequence  in  order  to  find  the  minimum  cost  one.  2)  An  A*  like  heuristic 
algorithm  which  uses  Euclidean  distances  as  a  heuristic  estimation  and  a  greedy  approach  to  find 
a  feasible  path.  The  algorithms  are  detailed  in  the  Appendix.  The  product  of  these  algorithms  is 
a  vehicle’s  feasible  path,  connecting  an  ordered  set  of  waypoints  among  obstacles. 

3.3  Task  assignment 

The  task  assignment  problem  is  represented  as  a  tree  (as  can  be  seen  in  Figure  3.2)  by  generating 
nodes  that  describe  a  combination  of  a  vehicle  V)  assigned  to  target  T).  Nodes  are  constructed 
until  all  combination  of  vehicles  and  targets  have  been  taken  into  account.  Each  node  is  associated 
with  a  cost.  For  example:  node  V{Tj  has  a  cost  that  equals  the  ’’lost”  benefit  granted  to  vehicle  i 
for  visiting  target  j.  The  ’’lost”  benefit  value  depends  on  the  time  that  it  takes  vehicle  i  to  reach 
target  j  from  its  initial  position,  which  in  turn  depends  on  the  vehicles’  path  length.  The  path  is 
obtained  using  a  motion  planning  subroutine  (described  in  section  3.2)  which  guarantees  feasible 
path  for  the  vehicles.  Since  the  motion  planning  subroutine  is  used  in  task  assignment  process, 
the  problem  solution  consists  of  a  primary  task  assignment  tree  search  which  in  turn  depends 
on  a  secondary  motion  planning  tree  search.  Two  algorithms  that  provide  solutions  to  the  task 
assignment  problem  are  proposed,  an  exhaustive  search  algorithm  and  a  greedy  algorithm.  The 
greedy  algorithm  provides  a  computationally  fast  solution  which  may  not  be  optimal,  and  the 
exhaustive  algorithm  explores  all  the  assignment  possibilities  to  derive  an  assignments  allocation 
with  minimum  cost  value. 

3.3.1  Exhaustive  Task  assignment  Algorithm 

The  proposed  algorithm  which  is  describe  in  Algorithm  1,  explores  every  branch  of  the  tree  to 
evaluate  all  the  assignments  possibilities.  The  input  to  the  algorithm  is  configurations  of  all 
the  vehicles,  obstacles  vertices  locations,  and  targets  positions.  The  algorithm  is  initialized  by 
computing  an  upper  bound  (line  1),  which  is  calculated  by  using  the  greedy  algorithm  described 
in  section  3.3.2.  The  upper  bound  is  used  to  prevent  unnecessary  exploration  of  branches  by 
bounding  the  branching  of  the  tree.  In  the  first  step  of  the  algorithm  the  first  layer  of  the  tree  is 
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generated,  all  possible  vehicle  target  combinations  are  described  as  nodes  and  the  associated  cost  is 
calculated  (lines  4-21).  After  the  initial  nodes  are  generated,  the  exhaustive  search  begins.  A  depth 
first  search  is  used  to  expand  a  branch  until  a  leaf  node  is  generated  (line  22-44).  When  a  node  is 
added,  the  accumulated  cost  of  the  branch  is  calculated  and  compared  with  the  upper  bound  (line 
28).  In  case  the  cost  is  higher,  the  branching  is  bounded.  If  a  leaf  node  is  reached  and  the  upper 
bound  is  higher  than  the  accumulated  cost,  the  upper  bound  is  updated  and  assignments’  order 
for  each  vehicle  is  stored  (line  34-38).  The  process  is  repeated  until  all  branches  have  been  either 
bounded  or  completely  explored.  The  algorithm  output  is  a  minimum  cost  ordered  set  of  targets, 
assigned  to  each  vehicle. 

3.3.2  Greedy  Task  Assignment  Algorithm 

The  proposed  algorithm  is  based  on  a  greedy  search  method  which  enables  to  quickly  find  an 
assignment  solution,  it  is  described  in  Algorithm  2.  Since  the  algorithm  is  greedy  by  nature, 
the  objective  function  used  is  the  reward  function  presented  in  Equation  3.6.  The  input  to  the 
algorithm  is  the  configurations  of  all  the  vehicles,  obstacles  vertices  locations,  and  targets’  positions. 
In  the  first  step  of  the  algorithm,  the  benefit  value  of  all  possible  pairs  of  vehicle-target  are 
computed  (lines  2-7).  The  vehicle  of  the  pair  with  the  highest  value  is  assigned  to  the  corresponding 
target  (line  8).  It  is  now  assumed  that  the  target  is  already  visited  and  that  the  vehicle  is  at  the 
target  position  (lines  9-12).  Next,  the  benefit  value  of  all  the  possible  pairs  is  evaluated  again  (lines 
14-19).  The  values  are  compared  in  order  to  find  the  pair  with  the  highest  one  and  to  relocate 
the  vehicle  to  the  corresponding  (now  visited)  target  (line  20-24).  This  procedure  is  repeated  until 
the  visit  requirements  are  fulfilled  (line  13).  The  output  of  the  algorithm  is  a  target  list  allocated 
to  each  vehicle  and  the  order  in  which  the  vehicles  need  to  visit  their  assigned  targets.  Unlike 
the  exhaustive  algorithm,  the  assignments’  order  does  not  guarantee  minimum  cost  since  only  a 
specific  (and  not  necessarily  the  optimal)  branch  of  the  tree  has  been  generated. 

3.4  Simulation  Results 

In  this  section,  sample-runs  are  used  to  demonstrate  the  presented  algorithms  and  to  explain  the 
different  parameters’  (vehicle  type,  benefit’s  descent  rate...)  influence  on  the  obtained  solution.  In 
all  the  different  scenarios,  the  task  assignment  algorithms  (exhaustive  or  greedy)  use  the  motion 
planning  subroutine  (exhaustive  or  heuristic)  based  on  the  relaxed  Dubins  distances,  hence  the 
coupling  of  the  problem  is  kept.  The  vehicles’  turn  radius  is  set  to  60  [m]  for  illustrative  reasons 
and  the  targets’  initial  benefit  values  can  be  set  between  1,000  to  10,000  (the  values  are  presented 
in  the  figures  next  to  each  target  as  numeral  between  1  and  10).  Each  scenario  has  a  summary 
table  presenting  the  acquired  and  lost  benefit,  and  the  solution  running  time. 

3.4.1  General  Scenario 

Figure  3.3  presents  a  scenario  in  which  2  aerial  vehicles  need  to  visit  4  targets  with  different  initial 
benefits.  The  scenario  solution  is  obtained  using  different  algorithm  setups  in  each  case.  Table  3.1 
summarizes  the  results  of  the  different  sample  runs.  The  highest  benefit  (lowest  lost  benefit) 
and  longest  running  time  were  gained  using  an  exhaustive  algorithm  setup  (Figure  3.3a).  When 
the  heuristic  motion  planning  is  used  instead  (Figure  3.3c),  the  cost  remains  the  same  but  the 
running  time  shortens.  For  the  case  of  a  greedy  task  assignment  and  heuristic  motion  planning 
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INPUT:  Vehicles  Configuration,  Target  position  and  and  Obstacles  vertices  positions. 

1:  Upper  bound= (Greedy  task  assignment  algorithm  solution) 

2:  Open  list=[  ]  >  Initialize  the  Open  list  -  stores  the  node 

to  be  examined  further. 

3:  Target  list =Tj  :  j  =  1 . . .  Nt  >  Initialize  a  list  containing  targets  to  be 

visited. 

4:  for  (Vi  :  i  =  1  to  Ny)  do  >  Generate  the  first  layer  of  the  tree. 

5:  for  ( Tj  :  j  =  1  to  Nt)  do 

6:  Current  node-Vehicle=V) 

7:  Current  node-Target=T) 

8:  Current  node-Path(V))=Path  Length(V),Tj)  >  Calculate  the  path  length  between  the 

current  vehicle  and  target. 

9:  Current  node-Cost=Lost  Benefit  function(Current  node-Path(V))) 

>  Calculate  the  Lost  benefit  value. 

10:  Current  node- Target  list=Target  list \Tj  >  Remove  the  current  target  from  the  tar¬ 

get  list  of  the  expanded  branch. 

11:  Current  node-vehicle  target  list(V))=  Tj  >  Store  the  targets  assigned  to  each  vehicle 

12:  if  Current  node-Target  list  =  0  then  >  If  a  leaf  node  is  reached  and  its  cost  is 

lower  than  the  upper  bound:  update  the 
upper  bound  and  store  the  vehicle’s  as¬ 
signed  targets 

13:  if  Current  node-Cost  <  Upper  bound  then 

14:  Upper  bound=Current  node-Cost (R) 

15:  Vehicle  target  list(V))=  Tj 

16:  end  if 

17:  else  >  Add  the  current  node  to  Open  list  for 

further  exploration. 

18:  Open  list  =Open  list  U  Current  node 

19:  end  if 

20:  end  for 

21:  end  for 

Algorithm  1:  Task  assignment  Exhaustive  search  algorithm 
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22 

while  Open  list  /  0  do 

>  Perform  a  Depth  First  Search 

23 

Parent  node=Open  list  (last  inserted  node) 

[>  Choose  the  last  inserted  node  as  the  cur¬ 

rent  parent  node  for  further  exploration 

24 

for  (Vi  :  i=Parent  node- Vehicle  to  Ny)  do 

o  Consider  all  possible  children  nodes. 

25 

for  Tj  £  Parent  node- Target  list  do 

26 

Child  n o de- Pat h ( V ) = Pat h  Length^, Tj) 

+  Parent  node-path(U) 

o  Calculate  the  overall  distance  trav¬ 

eled  by  vehicle-V;  in  the  explored 

branch. 

27 

Child  node-Cost=Parent  node-Cost 

[>  Calculate  the  cost 

+  Lost  Benefit  function(Child  node-path(Vj)) 

28 

if  Child  node-Cost  <  Upper  bound  then 

>  If  the  cost  is  higher  than  the  upper 

bound,  the  branching  is  bounded 

29 

Child  node-vehicle=V 

30 

Child  node-target=Tj 

31 

Child  node-Target  list—Parent  node-Target  list\Tj 

32 

Child  node-vehicle  target  list (V^ )=  [Parent  node-vehicle  target  list(V),  Tj] 

33 

if  Child  node-Target  list  =  0  then 

>  If  a  leaf  node  has  been  reached:  update 

the  upper  bound  and  store  the  vehi¬ 
cle’s  assigned  targets. 

34 

Upper  bound=Child  node- Cost (U) 

35 

Vehicle  target  list(Vj)=Child  node-vehicle  target  list (V^) 

36 

else 

37 

Open  list  =Open  list  U  Child  node 

>  Add  the  child  node  to  Open  list  for  fur¬ 

38 

end  if 

ther  exploration. 

39 

end  if 

40 

end  for 

41 

end  for 

42 

Open  list  =Open  list  \  Parent  node 

>  Remove  the  evaluated  Parent  node  from 

43 

end  while 

the  Open  list 

OUTPUT:  Vehicle  target  list  -  The  targets  assigned  to  each  vehicle  and  the  required  visitation  order. 

Algorithm  1:  Task  assignment  Exhaustive  search  algorithm  (Continue) 
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INPUT:  Vehicles  Configuration,  Target  position  and  and  Obstacles  vertices  positions. 

1:  Target  list =Tj  :  j  =  1 . . .  Nt  >  Initialize  a  list  containing  targets  to  be 

visited. 

2:  for  (Vi  :  i  =  1  to  Ny)  do  >  Calculate  the  benefits  of  all  possible 

vehicle-target  pairs. 

3:  for  (Tj  :  j  =  1  to  Nt)  do 

4:  Vehicle-target-path(V; ,Tj)=Path  Length(V,Tj) 

5:  Vehicle-target-benefit ( V  ,7} )= Benefit  function(Vehicle-target-path(V,T))) 

6:  end  for 

7:  end  for 

8:  [Ui,Tj]  <—  max( Vehicle-target-benefit (1^ ,Tj))  >  Find  the  pair  with  the  highest  benefit 

value. 

9:  Vehicle  target  list (V)= [Vehicle  target  list( ) ,  Tj]  >  Add  the  target  to  the  vehicle  assigned 

targets  list. 

10:  Target  list=Target  list\Tj  >  Remove  the  target  from  the  target  list. 

11:  Vehicle  position(V()=Target  position(Tj)  >  Update  the  vehicle  position. 

12:  Vehicle  path( R ) = Vehicle  path(V) 

+  Vehicle-target-Pathf V  ,Tj )  >  Calculate  the  overall  distance  traveled  by 

the  vehicle  . 

13:  Total  beneht—max( Vehicle-target-benefit ( V  ,T() )  >  Store  the  accumulated  benefit. 

Algorithm  2:  Task  assignment  Greedy  search  algorithm 
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14:  while  Target  list  /  0  do 


>  Repeat  the  process  until  all  targets  have 
been  visited. 

15:  for  (Vi  :  i  =  1  to  Ny)  do  o  Calculate  the  benefits  of  all  possible 

vehicle-target  pairs. 

16:  for  ( Tj  €  Target  list)  do 

17:  Vehicle-target-path (V{,Tj ) =Path  Length(V),T))  +  Vehicle  path(V)) 

18:  Vehicle-target-beneht(Vj;T))=Benefitfunction(Vehicle-target-path(V),^j)) 

+  Total  benefit 

19:  end  for 

20:  end  for 

21:  [V^,2j]  l—  max(  Vehicle-target-benefit (Vi.Tj))  >  Find  the  pair  with  the  highest  benefit 

value. 

22:  Vehicle  target  list  (V)= [Vehicle  target  list  (Vi),  Tj\  >  Store  the  vehicle  assigned  target. 

23:  Target  list=Target  list \Tj  >  Remove  the  target  from  the  target  list. 

24:  Vehicle  position(V))=Target  position(T))  o  Update  the  vehicle  position. 

25:  Vehicle  path ( Vt ) = Vehicle  path(V)) 

+  Vehicle-target-path(V),Tj)  >  Calculate  the  overall  distance  traveled  by 

the  vehicle. 

26:  Total  benefit=iriax(Vehicle-target-benefit(V),T)))  >  Store  the  accumulated  benefit. 

+  Total  benefit 

27:  end  while 

OUTPUT:  Vehicle  target  list  -  The  targets  assigned  to  each  vehicle  and  the  required  visitation  order. 
Algorithm  2:  Task  assignment  Greedy  search  algorithm  (Continue) 
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0  200  400  600  800  1000  0  200  400  600  800  1000  0  200  400  600  800  1000 

x  (Meters)  x  (Meters)  x  (Meters) 

(a)  Exhaustive  Task  assignment  al-(b)  Greedy  Task  assignment  algo-(c)  Exhaustive  Task  assignment  al¬ 
gorithm;  Exhaustive  Motion  plan-rithm;  Heuristic  Motion  planninggorithm;  Heuristic  Motion  plan¬ 
ning  algorithm  algorithm  ning  algorithm 

Figure  3.3:  General  scenario  -  2  Vehicles  and  4  Targets  among  obstacles 


Figure  # 

Algorithms  used 

Initial 

Benefit 

Acquired 

Benefit 

Lost 

Benefit 

Overall 

Distance 

Solution 
Time  [seel 

Exhaustive 

3.3a 

TA 

Exhaustive 

MP 

21000 

9814 

11186 

1930 

58.1 

3.3b 

Greedy  TA 

Heuristic  MP 

Exhaustive 

21000 

9397 

11603 

2438 

1.05 

3.3c 

TA  Heuristic 
MP 

21000 

9814 

11186 

1930 

17.3 

Table  3.1:  Different  Initial  Benefits  Scenario 


algorithms  (Figure  3.3b),  the  lowest  benefit  (highest  cost)  and  shortest  running  time  are  attained. 
The  solution  presented  in  Figure  3.3a  shows  that  the  vehicles  are  generally  first  heading  towards 
targets  with  high  priority  while  taking  into  account  targets  with  lower  priority.  Since  the  benefit 
diminishes  with  time,  the  vehicle  does  not  head  directly  towards  the  high  priority  targets  but  also 
passes  through  low  priority  targets  which  are  closer  to  its  location  (upper  vehicle  on  Figure  3.3a). 
In  Figure  3.3b  the  upper  vehicles  head  directly  to  target  6  (initial  benefit  value=6)  and  skip  target 
1  since  in  this  case  the  task  assignment  algorithm  is  greedy  by  nature.  In  the  scenarios  presented  in 
sections  3.4.2  and  3.4.3  the  exhaustive  algorithms’  setup  yields  the  same  results  as  the  exhaustive 
task  assignment  algorithm  and  heuristic  motion  planning  algorithm  setup,  therefore  only  latter 
setup  is  presented.  Even  though  the  results  presented  in  these  sections  are  identical,  it  can  be 
shown  that  in  certain  cases  the  exhaustive  algorithms’  setup  provides  better  results  however,  the 
solution  running  time  becomes  longer. 
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x  (Meters)  x  (Meters)  x  (Meters) 

(a)  Greedy  Task  assignment  algo-(b)  Exhaustive  Task  assignment (c)  Exhaustive  Task  assignment  al- 
rithm;  Heuristic  Motion  planningalgorithm;  Heuristic  Motion  plan-gorithm;  Exhaustive  Motion  plan- 
algorithm  ning  algorithm  ning  algorithm;  Minimize  the  sum 

of  the  overall  distance  travelled 

Figure  3.4:  Equal  Benefits  scenario  -  2  Vehicles  and  4  Targets  among  obstacles 

3.4.2  Equal  Benefit  Scenario 

The  scenario  shown  in  Figure  3.4  is  similar  to  the  scenario  shown  in  Figure  3.3,  only  the  targets’ 
initial  benefit  is  equal.  Since  the  targets’  priority  is  identical,  we  expect  the  results  to  be  similar 
to  the  case  where  the  cost  function  objective  is  to  minimize  the  overall  distance  travelled  by 
the  vehicles  (Equation  3.10).  In  the  results  summarized  in  table  3.2,  the  highest  benefit  (lowest 
cost)  is  obtained  by  the  setup  of  the  exhaustive  task  assignment  algorithm  (Figure  3.4b).  The 
overall  distance  is  the  same  as  in  the  case  of  using  the  cost  function  which  minimizes  the  sum 
of  the  distance  travelled  (Figure  3.4c),  as  we  expected.  As  in  the  previous  scenario,  the  solution 
running  time  has  the  same  tendency,  when  using  a  greedy  and  heuristic  algorithms’  combination 
the  shortest  running  time  is  gained  and  with  an  exhaustive  algorithms  combination  the  longest 
running  time  is  gained.  This  tendency  remains  the  same  through  all  the  presented  scenarios. 

3.4.3  Comparing  Exhaustive  and  Greedy  Task  Assignment  Algorithms 

A  scenario  of  4  targets  and  2  vehicles  is  presented  in  Figure  3.5.  The  results  of  the  scenario 
(table  3.3)  support  the  claim  that  the  exhaustive  algorithm  always  provides  better  or  equal  results 
compared  to  the  greedy  algorithm.  This  can  be  seen  in  table  3.1  and  table  3.4  as  well.  The  main 
advantage  of  the  greedy  algorithm  is  its  low  computational  time,  which  makes  it  suitable  to  real 
time  applications.  In  cases  where  both  the  exhaustive  task  assignment  algorithms  and  the  motion 
planning  algorithm  are  used,  the  best  solution  coded  in  the  tree  is  obtained  and  the  lowest  cost 
assignments  allocation  and  vehicles’  paths  are  provided. 

3.4.4  Benefit  Time  Dependency 

A  simple  scenario  of  one  vehicle  and  two  targets  (initial  benefit  of  10  and  5)  is  used  in  Figure  3.6  and 
Figure  3.7.  The  time  dependency  can  be  easily  explained  using  this  two  scenarios,  in  Figure  3.6 
the  vehicle’s  path  passes  through  target  5  even  though  it  causes  the  vehicle  to  extend  its  path 
toward  target  10.  This  happens  because  the  time  it  takes  to  get  to  target  5  is  very  short  compared 
to  target  10,  and  in  order  to  minimize  the  ’’lost  benefit”  of  the  two  targets  it  is  better  to  first  pass 


35 


Figure  # 

Algorithms  used 

Initial 

Benefit 

Acquired 

Benefit 

Lost 

Benefit 

Overall 

Distance 

Solution 
Time  [sec] 

3.4a 

Greedy  TA 
Heuristic  MP 

Exhaustive 

4000 

1399 

2601 

3353 

1.3 

3.4b 

TA  Heuristic 
MP 

Exhaustive 

TA  Heuristic 
MP  (Sum  of 
path  length 

cost  function- 
Eq.  3.10) 

4000 

1623 

2377 

2075 

73.7 

3.4c 

4000 

1623 

2377 

2075 

73.7 

Table  3.2:  Equal  Initial  Benefits  Scenario 
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(a)  Greedy  Task  assignment  algorithm;  Heuristic(b)  Exhaustive  Task  assignment  algorithm; 
Motion  planning  algorithm  Heuristic  Motion  planning  algorithm 

Figure  3.5:  Comparison  Between  Exhaustive  and  Greedy  Task  Assignment  Algorithms 
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Figure  # 

Algorithms  used 

Initial 

Benefit 

Acquired 

Benefit 

Lost 

Benefit 

Overall 

Distance 

Solution 
Time  [sec] 

3.5a 

Greedy  TA 
Heuristic  MP 

Exhaustive 

24000 

10027 

13973 

2514 

0.9 

3.5b 

TA  Heuristic 
MP 

24000 

13504 

10496 

1487 

13.6 

Table  3.3:  Comparing  Exhaustive  and  Greedy  Task  Assignment  Algorithms 


Figure  3.6:  Scenario  1  -  Exhaustive  TA  ;  Exhaustive  MP  and  Greedy  TA  ;  Heuristic  MP 

through  target  5.  In  Figure  3.7,  however,  the  time  it  takes  to  get  to  target  5  is  still  shorter  than 
the  time  it  takes  to  get  to  target  10  but  since  target  10  is  now  located  closer  to  the  vehicle  it  is 
better  to  first  pass  though  target  10.  These  two  scenarios  demonstrate  how  the  arrival  time  of  the 
vehicle  to  each  target  influences  the  task  assignment  process.  In  both  of  these  small  sized  simple 
cases  the  greedy  algorithm  provides  an  identical  result  to  the  exhaustive  algorithm’s  result  and 
the  vehicle’s  path  remains  the  same. 

3.4.5  Benefit  Descent  Rate 

The  scenario  of  two  targets  and  one  vehicle  presented  in  Figure  3.8  helps  illustrate  the  influence 
of  the  descent  rate  on  the  obtained  results.  In  this  scenario,  the  time  it  takes  the  vehicle  to  get  to 
target  3  is  200  [sec]  from  its  initial  position  and  the  time  it  takes  the  vehicle  to  get  to  target  10  is 
500  [sec]  from  the  same  position.  By  increasing  the  value  of  the  descent  rate,  the  benefit  rapidly 
diminishes  as  time  progresses.  In  Figure  3.8b  the  descent  rate  is  increased  by  5  times  compared  to 
Figure  3.8a,  which  causes  the  change  in  the  vehicle  assignments’  order.  Before  the  increase  of  the 
descent  rate,  the  benefit  of  target  10  is  significantly  higher  than  that  of  target  3  (upper  red  bullet 
compared  to  lower  red  bullet  in  Figure  3.1a)  but  after  the  descent  rate  is  increased,  the  targets 
have  similar  benefits  (as  can  be  seen  by  the  red  bullets’  vertical  position  in  Figure  3.1b).  Since  the 
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Figure  3.7:  Scenario  2  -  Exhaustive  TA 


Exhaustive  MP  and  Greedy  TA  ;  Heuristic  MP 


Figure  # 

Algorithms  used 

Initial 

Acquired 

Lost 

Overall 

Benefit 

Benefit 

Benefit 

Distance 

3.6 

Exhaustive 

TA 

Exhaustive 

MP 

15000 

6571 

8529 

1371 

3.6 

Greedy  TA 

Heuristic  MP 

15000 

6571 

8529 

1371 

3.7 

Exhaustive 

TA 

Exhaustive 

MP 

15000 

6634 

8366 

1333 

3.7 

Greedy  TA 
Heuristic  MP 

15000 

6634 

8366 

1333 

Table  3.4:  Scenario  1  &  Scenario  2 
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(a)  Exhaustive  Task  assignment  algorithm;  (b)  Exhaustive  Task  assignment  algorithm; 
Exhaustive  Motion  planning  algorithm  Exhaustive  Motion  planning  algorithm 

Figure  3.8:  Benefit  Decent  Rate 


Figure  # 

Algorithms  used 

Initial 

Acquired 

Lost 

Overall 

Benefit 

Benefit 

Benefit 

Distance 

3.8a 

Exhaustive 

TA 

Exhaustive 

MP 

13000 

6528 

6472 

1414 

Exhaustive 

3.8b 

TA 

Exhaustive 

MP 

13000 

1004 

11996 

1063 

Table  3.5:  Benefit  Decent  Rate 


benefit  that  the  vehicle  can  gather  in  target  10  is  smaller  than  the  one  in  target  3,  it  is  preferable 
to  change  the  targets’  visitation  order  as  can  be  seen  in  Figure  3.8b.  The  results  summarized  in 
table  3.5  emphasize  the  influence  of  the  descent  rate  as  the  benefit  acquired  in  Figure  3.8a  is  much 
higher  than  the  one  in  Figure  3.8b.  The  benefit  decent  rate  does  not  only  influence  the  benefit 
gathered  but  may  also  influence  the  assignments’  order  as  presented  above. 


Appendix 

The  task  assignment  algorithm  uses  a  motion  planning  subroutine.  The  motion  planning  sub¬ 
routine  includes  two  types  of  algorithms.  These  algorithms  are  based  on  a  search  process  that 
explores  the  tree  described  in  section  3.2.  The  two  algorithms  are  described  next,  an  exhaustive 
search  algorithm  that  finds  the  minimum  cost  branch,  and  a  heuristic  greedy  algorithm  that  finds 
a  feasible  solution  in  a  shorter  computational  time. 
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Heuristic  motion  planning  algorithm 

Given  a  set  of  ordered  targets,  vehicle’s  initial  configuration  and  the  obstacles’  vertices  position, 
the  algorithm  performs  an  A*-like  greedy  search  based  on  Euclidean  distances  heuristics.  The 
algorithm  steps  are  as  follows: 

1.  The  vehicle  is  assumed  to  be  located  and  oriented  according  to  the  initial  configuration.  The 
first  target  is  set  to  be  the  current  target. 

2.  Calculate  an  obstacles  free  relaxed  path  connecting  the  vehicle’s  current  configuration  and 
the  current  target.  If  such  a  path  exist  skip  to  step  7. 

3.  Calculate  an  obstacles  free  relaxed  paths,  connecting  the  current  configuration  and  all  of  the 
the  obstacles  vertices. 

4.  Find  the  vertex  with  the  following  property:  the  sum  of  the  relaxed  path  length  (connecting 
the  vertex  and  the  vehicle’s  current  configuration)  and  the  Euclidean  distance  (connecting 
the  vertex  and  the  current  target)  is  minimum. 

5.  The  vehicle  is  assumed  to  be  located  at  the  vertex  with  the  described  property  and  the 
orientation  angle  becomes  the  vertex  arrival  angle. 

6.  Return  to  step  2. 

7.  The  vehicle  is  assumed  to  be  located  at  the  current  target  and  the  orientation  angle  is  set  to 
be  the  arrival  angle. 

8.  The  next  target  to  visit  on  the  targets’  set  becomes  the  current  target. 

9.  Repeat  steps  2-8  until  the  entire  targets’  set  is  visited. 


The  algorithm  output  is  a  vehicle  trajectory,  represented  by  an  ordered  set  of  nodes  (including 
targets  and  obstacles’  vertices)  that  need  to  be  visited  using  relaxed  paths. 

Exhaustive  motion  planning  algorithm 

The  algorithm  explores  every  branch  of  the  search  tree  and  evaluates  every  possible  visit  order 
sequence  in  order  to  find  the  minimum  cost  branch.  The  algorithm’s  input  is  a  set  of  ordered 
targets,  vehicle’s  initial  configuration  and  the  obstacles’  vertices  position.  The  algorithm  steps  are 
as  follows: 

1.  Calculate  the  initial  upper  bound,  using  the  heuristic  algorithm. 
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2.  An  OPEN  list  is  generated  to  store  the  nodes  that  will  be  examined  as  the  next  node  to 
visit. 

3.  The  initial  configuration  is  entered  to  OPEN  as  a  node. 

4.  The  node  with  the  lowest  cost  in  OPEN  (relaxed  path  connecting  the  initial  configuration 
and  the  node)  is  selected  as  the  current  node. 

5.  The  neighbors  of  the  current  node  that  can  come  after  it  in  the  visit  order  are  examined. 
Their  estimated  distance  is  calculated.  It  is  defined  as  the  sum  of  the  following: 

(a)  Cost  of  the  selected  node. 

(b)  Relaxed  path  connecting  the  selected  node  and  the  neighbor. 

(c)  Euclidean  distance  between  the  neighbor  and  the  current  target  to  visit. 

(d)  The  total  Euclidean  distance  which  connects  the  current  target  and  the  remaining  tar¬ 
gets  to  visit  in  the  targets’  set  in  the  required  order. 

6.  The  neighbors  with  an  estimated  distance  which  is  lower  than  the  current  upper  bound  are 
added  to  OPEN  as  new  nodes. 

7.  All  of  the  new  nodes  added  to  OPEN  are  examined. 

(a)  In  case  a  new  node  is  the  next  target  to  visit,  the  target  is  marked  as  visited  in  the 
current  explored  branch. 

(b)  In  case  a  new  node  is  the  last  target  to  visit  and  the  entire  targets’  set  is  visited  in  the 
required  order,  a  leaf  node  of  the  branch  is  reached  and  the  entire  branch  is  explored. 

i.  The  upper  bound  is  updated  to  the  relaxed  path  total  length  described  by  the  nodes 
in  the  branch  and  the  visit  order  of  the  nodes  is  stored. 

8.  The  current  node  is  removed  from  OPEN  since  all  the  neighbors  have  been  evaluated. 

9.  This  process  is  repeated  until  the  OPEN  list  is  empty. 

The  algorithm  output  is  identical  to  the  heuristic  algorithm  output  described  above. 
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